<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>drew</title><description>Thoughts on life, tech, and math.</description><link>https://drewstone.github.io/</link><item><title>Lifting Auto-Research</title><link>https://drewstone.github.io/posts/lifting-auto-research/</link><guid isPermaLink="true">https://drewstone.github.io/posts/lifting-auto-research/</guid><description>Auto-research is a loop. Letting an agent optimize that loop lifts it one level up. Doing this recursively builds a tower. Whether the tower ascends or just spins depends on a single property: grounding.</description><pubDate>Mon, 27 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Autonomous Autoresearch</title><link>https://drewstone.github.io/posts/autonomous-autoreserach/</link><guid isPermaLink="true">https://drewstone.github.io/posts/autonomous-autoreserach/</guid><pubDate>Sat, 25 Apr 2026 00:00:00 GMT</pubDate></item><item><title>How I rebuilt the blog</title><link>https://drewstone.github.io/posts/how-i-rebuilt-the-blog/</link><guid isPermaLink="true">https://drewstone.github.io/posts/how-i-rebuilt-the-blog/</guid><pubDate>Sat, 25 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Convergence as a first-class eval primitive</title><link>https://drewstone.github.io/posts/convergence-as-eval-primitive/</link><guid isPermaLink="true">https://drewstone.github.io/posts/convergence-as-eval-primitive/</guid><description>Binary pass/fail is a useless signal for multi-turn agents.
Replace it with continuous completion curves, monotone progress, and resumable runs.</description><pubDate>Fri, 24 Apr 2026 00:00:00 GMT</pubDate></item><item><title>The ensemble and the edit</title><link>https://drewstone.github.io/posts/the-ensemble-and-the-edit/</link><guid isPermaLink="true">https://drewstone.github.io/posts/the-ensemble-and-the-edit/</guid><description>Two ways to render a workflow agent in a rich chat UI, and why you probably want both.</description><pubDate>Thu, 23 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Teaching Agents to Improve Themselves</title><link>https://drewstone.github.io/posts/self-improving-agents/</link><guid isPermaLink="true">https://drewstone.github.io/posts/self-improving-agents/</guid><description>We built four composable skills that turn any coding agent into an autonomous improvement loop. Here is how they work and what they found.</description><pubDate>Wed, 18 Mar 2026 00:00:00 GMT</pubDate></item><item><title>RL Without Gradients</title><link>https://drewstone.github.io/posts/agentic-eval-improvement/</link><guid isPermaLink="true">https://drewstone.github.io/posts/agentic-eval-improvement/</guid><description>Agents cannot update their own weights. But they can change their prompts, tools, memory, and planning strategies. What does the outer optimization loop look like?</description><pubDate>Mon, 16 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Sandboxes All the Way Down</title><link>https://drewstone.github.io/posts/building-on-tangle/</link><guid isPermaLink="true">https://drewstone.github.io/posts/building-on-tangle/</guid><description>AI agents need isolated compute. 
Building the infrastructure that provisions it, meters it, and stays out of the way.</description><pubDate>Sun, 15 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Multi-Agent Orchestration with Convergence Loops</title><link>https://drewstone.github.io/posts/deepwork-orchestrator/</link><guid isPermaLink="true">https://drewstone.github.io/posts/deepwork-orchestrator/</guid><description>Draft, review, revise, repeat. The hard part is not the loop. It is keeping agent sessions coherent across iterations.</description><pubDate>Sat, 14 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Anatomy of an Autonomous Security Audit</title><link>https://drewstone.github.io/posts/redteam-architecture/</link><guid isPermaLink="true">https://drewstone.github.io/posts/redteam-architecture/</guid><description>We tried one big agent with every security tool. It was terrible. Here is what actually works.</description><pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Vibecoding a Browser Agent</title><link>https://drewstone.github.io/posts/vibecoding-a-browser-agent/</link><guid isPermaLink="true">https://drewstone.github.io/posts/vibecoding-a-browser-agent/</guid><description>We gave Claude Code the directive to build its own experimentation harness, run tests, measure regressions, and iterate autonomously. It works.</description><pubDate>Wed, 11 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Convergence in Multi-Agent Review Loops</title><link>https://drewstone.github.io/posts/convergence-loops/</link><guid isPermaLink="true">https://drewstone.github.io/posts/convergence-loops/</guid><description>When you have AI agents writing and reviewing each other, how do you know when to stop? 
The math of iterative quality convergence.</description><pubDate>Tue, 10 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Building a Browser Agent That Doesn&apos;t Get Stuck</title><link>https://drewstone.github.io/posts/browser-agent-stuck-detection/</link><guid isPermaLink="true">https://drewstone.github.io/posts/browser-agent-stuck-detection/</guid><description>Detecting when an autonomous browser agent is going in circles, and what to do about it.</description><pubDate>Sat, 07 Mar 2026 00:00:00 GMT</pubDate></item><item><title>The Expected Cost of Fallback Chains</title><link>https://drewstone.github.io/posts/provider-fallback-chains/</link><guid isPermaLink="true">https://drewstone.github.io/posts/provider-fallback-chains/</guid><description>When you route AI requests across 40 providers with retries and fallbacks, what does the cost distribution actually look like?</description><pubDate>Thu, 05 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Exploit-or-Disprove: Adversarial Validation of Security Findings</title><link>https://drewstone.github.io/posts/exploit-or-disprove/</link><guid isPermaLink="true">https://drewstone.github.io/posts/exploit-or-disprove/</guid><description>Automated security auditing produces false positives. The fix is a second agent whose only job is to write a working exploit or downgrade the finding.</description><pubDate>Tue, 03 Mar 2026 00:00:00 GMT</pubDate></item><item><title>One API for Eight Browser Backends</title><link>https://drewstone.github.io/posts/session-multiplexing/</link><guid isPermaLink="true">https://drewstone.github.io/posts/session-multiplexing/</guid><description>Chrome, Safari, iOS simulators, Android emulators, physical devices. How we built a unified session layer for all of them.</description><pubDate>Wed, 25 Feb 2026 00:00:00 GMT</pubDate></item></channel></rss>