drew

Thoughts on life, tech, and math.
https://drewstone.github.io/

Lifting Auto-Research
https://drewstone.github.io/posts/lifting-auto-research/
Auto-research is a loop. Letting an agent optimize that loop lifts it one level up. Doing this recursively builds a tower. Whether the tower ascends or just spins depends on a single property: grounding.
Mon, 27 Apr 2026 00:00:00 GMT

Autonomous Autoresearch
https://drewstone.github.io/posts/autonomous-autoreserach/
Sat, 25 Apr 2026 00:00:00 GMT

How I rebuilt the blog
https://drewstone.github.io/posts/how-i-rebuilt-the-blog/
Sat, 25 Apr 2026 00:00:00 GMT

Convergence as a first-class eval primitive
https://drewstone.github.io/posts/convergence-as-eval-primitive/
Binary pass/fail is a useless signal for multi-turn agents. Replace it with continuous completion curves, monotone progress, and resumable runs.
Fri, 24 Apr 2026 00:00:00 GMT

The ensemble and the edit
https://drewstone.github.io/posts/the-ensemble-and-the-edit/
Two ways to render a workflow agent in a rich chat UI, and why you probably want both.
Thu, 23 Apr 2026 00:00:00 GMT

Teaching Agents to Improve Themselves
https://drewstone.github.io/posts/self-improving-agents/
We built four composable skills that turn any coding agent into an autonomous improvement loop. Here is how they work and what they found.
Wed, 18 Mar 2026 00:00:00 GMT

RL Without Gradients
https://drewstone.github.io/posts/agentic-eval-improvement/
Agents cannot update their own weights. But they can change their prompts, tools, memory, and planning strategies. What does the outer optimization loop look like?
Mon, 16 Mar 2026 00:00:00 GMT

Sandboxes All the Way Down
https://drewstone.github.io/posts/building-on-tangle/
AI agents need isolated compute. Building the infrastructure that provisions it, meters it, and stays out of the way.
Sun, 15 Mar 2026 00:00:00 GMT

Multi-Agent Orchestration with Convergence Loops
https://drewstone.github.io/posts/deepwork-orchestrator/
Draft, review, revise, repeat. The hard part is not the loop. It is keeping agent sessions coherent across iterations.
Sat, 14 Mar 2026 00:00:00 GMT

Anatomy of an Autonomous Security Audit
https://drewstone.github.io/posts/redteam-architecture/
We tried one big agent with every security tool. It was terrible. Here is what actually works.
Fri, 13 Mar 2026 00:00:00 GMT

Vibecoding a Browser Agent
https://drewstone.github.io/posts/vibecoding-a-browser-agent/
We gave Claude Code the directive to build its own experimentation harness, run tests, measure regressions, and iterate autonomously. It works.
Wed, 11 Mar 2026 00:00:00 GMT

Convergence in Multi-Agent Review Loops
https://drewstone.github.io/posts/convergence-loops/
When you have AI agents writing and reviewing each other, how do you know when to stop? The math of iterative quality convergence.
Tue, 10 Mar 2026 00:00:00 GMT

Building a Browser Agent That Doesn't Get Stuck
https://drewstone.github.io/posts/browser-agent-stuck-detection/
Detecting when an autonomous browser agent is going in circles, and what to do about it.
Sat, 07 Mar 2026 00:00:00 GMT

The Expected Cost of Fallback Chains
https://drewstone.github.io/posts/provider-fallback-chains/
When you route AI requests across 40 providers with retries and fallbacks, what does the cost distribution actually look like?
Thu, 05 Mar 2026 00:00:00 GMT

Exploit-or-Disprove: Adversarial Validation of Security Findings
https://drewstone.github.io/posts/exploit-or-disprove/
Automated security auditing produces false positives. The fix is a second agent whose only job is to write a working exploit or downgrade the finding.
Tue, 03 Mar 2026 00:00:00 GMT

One API for Eight Browser Backends
https://drewstone.github.io/posts/session-multiplexing/
Chrome, Safari, iOS simulators, Android emulators, physical devices. How we built a unified session layer for all of them.
Wed, 25 Feb 2026 00:00:00 GMT