CAUTION · EXPERIMENT RUNNING · CAUTION · EXPERIMENT RUNNING ·

Premise

This blog is a live experiment in human–AI co-authorship. Most posts here are drafted, edited, or revisited by AI agents under my direction. Every revision the system can observe is logged: the model that made it, when it landed, what changed, and (when captured) the full session trace that produced it.

Some posts are mine alone — hand-written, untouched by AI. The two are kept visually distinct. The rest of this page is the dashboard of the AI side.

At a glance

14
AI-touched posts
2 more written by hand
34
Tracked revisions
11 reconstructed from old commits
2
Distinct models
Opus 4.7 leads with 23
2
Human originals
0 human edits on AI posts

Authors

Every author who has touched a post — humans in green, models in violet/blue. Hover a row to see which posts they edited and when.

  • Opus 4.7 23 last Apr 27
    Posts touched (14)
    • Lifting Auto-Research
    • Convergence as a first-class eval primitive
    • The ensemble and the edit
    • Teaching Agents to Improve Themselves
    • RL Without Gradients
    • Sandboxes All the Way Down
    • +8 more
  • Opus 4.6 11 last Mar 18
    Posts touched (11)
    • Teaching Agents to Improve Themselves
    • RL Without Gradients
    • Sandboxes All the Way Down
    • Multi-Agent Orchestration with Convergence Loops
    • Anatomy of an Autonomous Security Audit
    • Vibecoding a Browser Agent
    • +5 more

Most edited posts

Posts the loop keeps coming back to. Hover the row to see the revision timeline. Each dot is one edit, colored by model, plotted on the post's lifetime.

Recent activity

The last 20 edits the system observed. Open the full trace log →

  1. Opus 4.7
    captured session · 2 asst turns · 1 tool calls
  2. Opus 4.7
    initial draft — StatGrid baseline, Scorecard for audit scenario, three-layer scoring, monotonicity, resumable runs
  3. Opus 4.7
    polish pass: Sidenote on PoC realism; closing rewritten from meta-commentary to concrete per-criterion diagnosis
  4. Opus 4.7
    diagram fix: legend moved below plot area so labels no longer cut off
  5. Opus 4.7
    polish pass: stray tex formula removed, narrator-smoothing example added showing three debugging cycles compressed into one clause
  6. Opus 4.7
    closing tightened to the concrete shift — one overnight run, six failure modes classified, four fixed, $30 of compute replacing two weeks of diagnostic work
  7. Opus 4.7
    10 asst turns, 9 tool calls captured
  8. Opus 4.7
    draft — first framing around linear extensions, rejected by operator as muddled
  9. Opus 4.7
    full rewrite — scrapped order-theory angle, reframed around shadcn-chat baseline and ensemble-vs-edit as real options
  10. Opus 4.7
    richness pass: built ChatMock component, added four rendered mockups inline (baseline / ensemble / edit / combined)
  11. Opus 4.7
    ChatMock redesigned: replaced wireframe transcript with rounded-card layout, rounded-lg tool pills with SVG icon badges, run_group collapsible containers
  12. Opus 4.7
    5 asst turns, 5 tool calls captured
  13. Opus 4.7
    composition diagram redrawn: /evolve becomes the outer container wrapping the three commands; phase track nested inside; switched palette to current B&W + action-color vocabulary
  14. Opus 4.7
    3 asst turns, 3 tool calls captured
  15. Opus 4.7
    2 asst turns, 2 tool calls captured
  16. Opus 4.7
    2 asst turns, 2 tool calls captured
  17. Opus 4.7
    4 asst turns, 2 tool calls captured
  18. Opus 4.7
    2 asst turns, 2 tool calls captured
  19. Opus 4.7
    1 asst turns, 1 tool calls captured
  20. Opus 4.7
    2 asst turns, 2 tool calls captured

The rules

AI may edit
  • Any post without original: true in its frontmatter
  • Frontmatter (title, description, tags) when revising content
  • Components in src/components/ and layouts
  • The trace-capture pipeline itself
AI must not edit
  • Any post with original: true — hard rule, in CLAUDE.md
  • The list of authors on a human post
  • Revision logs older than the agent's session start

Methodology

How the dashboard above is built.

  1. 1
    Capture

    A post-commit hook checks whether any src/content/posts/*.mdx file changed. If so, the script reads the active session log (Claude Code or Codex), extracts the model, timestamp, and a one-line change note, and appends a revision entry to the post's frontmatter with a trace_id.

  2. 2
    Trace store

    Each captured session is written to src/content/traces/<trace_id>.json — the full message stream, tool calls, and artifacts. The traces collection is loaded by Astro's content layer at build time.

  3. 3
    Reconstruction

    Posts written before the capture pipeline existed have a single reconstructed: true revision back-filled from the git history. Models for those entries are best-guess, marked clearly in the UI.

  4. 4
    Aggregation

    This page reads every post's frontmatter at build time, joins it with the trace store, and renders the dashboard. There is no database. Everything you see here is in the repo.