140 Pull Requests in a Week: Orchestrating Claude Code Agents at a Hackathon

Posted on June 12, 2026 · 10 min read

Claude Code, AI, Developer Tools, Productivity

It’s been quiet on this blog. The honest reason is a demanding client project that ate every spare hour for months. But last week broke the silence in the best possible way: our team spent it at a hackathon that capped off a renovation we’d been grinding on for several months — and we came out the other side having opened more than 140 pull requests in that single week.

We did it with massive AI assistance. What makes the number almost absurd: we’d only had Claude Code in our hands for the last three weeks. It didn’t hurt that Claude’s new Mythos model landed mid-week, dropping a fresh capability bump right into the middle of the sprint. We were so aggressive about it that we exhausted the token limit on our enterprise Claude subscription twice. (If you’ve never seen a rate-limit wall mid-sprint, it’s a humbling experience — and a strangely good sign you’re using the tool to its limit.)

Here’s the thing I want to share, because it’s the part that actually mattered: the throughput didn’t come from typing faster. It came from orchestration. Once you stop thinking of an AI assistant as one chat window and start thinking of it as a fleet you direct, the bottleneck moves from “how fast can I write code” to “how many independent problems can I keep in flight at once.”

This post is about the mechanics of that — running Claude Code agents in parallel, driving them without losing your mind, what persists when things reload, how agents and skills work together, and a small but mighty status line that keeps you oriented through all of it.

From One Assistant to Many

The mental shift is this: a single Claude Code conversation is a worker. Useful, but serial. The leverage comes from running many of them.

There are two distinct things people mean by “agents,” and conflating them causes confusion:

Subagents — Claude dispatches these itself, mid-task, to handle a side quest. Each runs in its own context window, does its job (searching the codebase, reviewing a diff, gathering logs), and returns a summary to the main conversation. They keep the main thread’s context clean. A subagent can’t spawn further subagents — the nesting is one level deep.
Agent View (claude agents) — a dashboard you drive, where you launch and monitor multiple independent background sessions. Each is a full Claude Code session running on your machine. This is the one that turns a hackathon into a fleet operation: ten problems, ten sessions, each chewing through its own branch.

At the hackathon, the pattern that won was simple. Decompose the work into chunks that don’t share state — “migrate this module,” “rewrite these tests,” “add this endpoint” — and hand each to its own session. Independent problems are the unit of parallelism. The moment two agents need to touch the same file, you’ve created a merge conflict you’ll pay for later, so the upfront decomposition is the real skill.
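
One way to sanity-check a decomposition is to compare the files two task branches have touched. The sketch below is only an illustration in plain git and grep: the helper names `changed_files` and `overlap`, the base branch `main`, and the branch names in the usage comment are all hypothetical, and it assumes both branches were cut from the same base.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for spotting file overlap between two task branches.

# List the files a branch changed relative to its merge base with a base branch.
changed_files() {
  local branch=$1 base=${2:-main}
  git diff --name-only "${base}...${branch}"
}

# Print the paths that appear in both newline-separated lists.
# $1 and $2 are files (or process substitutions) holding one path per line.
overlap() {
  grep -Fx -f "$2" "$1" | sort -u
}

# Usage (inside a git repository):
#   overlap <(changed_files task/migrate-module) <(changed_files task/rewrite-tests)
# Empty output means the two branches touched disjoint sets of files.
```

Any path this prints is a candidate merge conflict, which is a hint that the two chunks were not as independent as they looked.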

Tip: When agents must work on overlapping code, run them in isolated git worktrees so each gets its own checkout. Claude Code supports worktree isolation for exactly this reason — parallel edits that would otherwise collide.
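
For a sense of what worktree isolation looks like underneath, here is a rough sketch in plain git. It is not how Claude Code sets up its own worktrees; the helper name `make_worktrees`, the `task/` branch prefix, and the sibling-directory layout are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: one worktree and one branch per task, side by side.
# Assumes it runs inside an existing git repository with at least one commit.

make_worktrees() {
  local base=$1; shift          # branch or commit each task starts from
  local root task
  root=$(git rev-parse --show-toplevel) || return 1
  for task in "$@"; do
    # Creates <repo>-<task> next to the repo, on a new branch task/<task>.
    git worktree add -b "task/${task}" "${root}-${task}" "$base" || return 1
  done
}

# Usage:
#   make_worktrees main migrate-module rewrite-tests
#   git worktree list      # the main checkout plus one entry per task
```

Each directory is a full checkout on its own branch, so two sessions can edit the same file without touching each other’s working tree. The conflict, if there is one, shows up at merge time instead of mid-edit.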

Driving Agents Without Losing Your Mind

Running one agent is easy. Running six means you need to move between them fast and never lose track of what’s where. That’s entirely a keyboard-shortcut problem, and it’s worth burning the muscle memory in early.

The ones that earned their keep:

| Shortcut | What it does |
|---|---|
| `Ctrl+O` | Toggle verbose transcript: see the full tool calls, not just the summary |
| `Ctrl+B` | Background the current task so you can go do something else |
| `Shift+Tab` | Cycle permission modes (plan / accept-edits / etc.) |
| `Esc` | Interrupt: stop the agent mid-thought to course-correct |

Inside Agent View itself, navigation is its own little world:

| Key | Action |
|---|---|
| `↑` / `↓` | Move between sessions |
| `Space` | Open the peek panel for the highlighted session |
| `Enter` | Attach to a session (or dispatch, if you’ve typed a prompt) |
| `Ctrl+R` | Rename a session: give it a name you’ll recognize at a glance |
| `Ctrl+X` | Stop / kill a session you’re done with |

A word of caution worth printing in bold: keybindings change between Claude Code versions, and several of these are recent additions. Don’t memorize a blog post — run /keybindings in your own install to see the authoritative list for your version, and customize it there. The Esc-to-interrupt habit alone is worth forming: the single biggest time sink with AI agents is letting one run three minutes down the wrong path before you stop it.

Persistence: What Survives a Reload

A real fear when you’ve got six sessions running is: what happens when I close my laptop, restart the CLI, or the conversation gets compacted? During a multi-day hackathon this stops being hypothetical.

Here’s the rough mental model:

Background agent sessions persist on disk. A supervisor process keeps them alive, and the CLI reconnects to them after a restart. They survive your machine going to sleep. They do not survive a full shutdown of the machine.
Subagent transcripts persist — the work a subagent did isn’t lost when it returns; the transcript is written out and can be picked back up.
Looping / scheduled tasks are session-scoped and expire. A recurring task you set up with /loop lives in its conversation and auto-expires after about 7 days unless you resume it. So “set it and forget it forever” isn’t the model — “set it and check back this week” is.

The practical upshot: lean on background sessions for in-flight work you want to come back to, but if you need something to run reliably for weeks regardless of your laptop’s state, reach for cloud-side scheduling rather than a local loop.

Agents and Skills, Working Together

Agents are who does the work. Skills are how they do it — reusable instructions that load on demand. I wrote a whole post on skills earlier; here I want to show how they combine with agents in a real workflow.

The decision is simpler than it looks:

Reach for a skill when you have reusable instructions or a repeatable procedure (deploy steps, a commit convention, your team’s test patterns). Skills auto-trigger when your request matches their description, or you invoke one directly with /skill-name.
Reach for a custom agent (.claude/agents/*.md) when you want an isolated specialist with its own system prompt, restricted tool access, or a different model. The frontmatter is small — name, description, optional tools and model — and you summon one by @-mentioning it or letting Claude delegate.

---
name: pr-reviewer
description: Reviews open pull requests for correctness and convention violations.
tools: Read, Grep, Glob, Bash
model: sonnet
---

You are a focused PR reviewer. For each open PR, check...

Three combinations did real work for us:

1. Skill libraries you didn’t write yourself. Superpowers is a plugin that ships a library of process skills — structured brainstorming, test-driven development, systematic debugging, plan execution. The value isn’t any single skill; it’s that the workflow discipline is encoded, so an agent doesn’t skip the design step or the verification step just because it’s eager. On a hackathon where you’re moving fast, having a skill that forces “write the design before the code” is a guardrail, not a brake.

2. Babysitting PRs on a loop. With 140 PRs flying, reviewing becomes the bottleneck, not writing. The /loop command lets you run a task on a recurring interval:

/loop 15m check open PRs, summarize new review comments, and flag anything that needs my decision

You go build something else; every 15 minutes an agent sweeps the PR queue and surfaces only what needs a human. Omit the interval and Claude paces itself. (Remember the 7-day expiry from above.)

3. Tickets as the source of truth via MCP. This is where it clicks into a real workflow. With an MCP server for your issue tracker — Linear, Jira, GitHub Issues — an agent can read a ticket directly, implement it, and link the PR back. The task definition lives where your team already manages it; the agent picks it up from there instead of you copy-pasting requirements into a chat. MCP servers are configured per project (or per agent), and they’re what turn “Claude in a terminal” into “Claude wired into your actual toolchain.”

The compounding effect is the point: a custom agent, running a disciplined skill, fed by a ticket from MCP, opening a PR that another looping agent reviews. That’s the assembly line that produces 140 PRs in a week.

Bonus: A Status Line That Keeps You Oriented

When you’re running a fleet, the single most disorienting thing is losing track of where you are — which model, how much context is left, whether you’re about to hit the weekly limit (ask me how I know), and crucially which git branch you’re on. A custom status line fixes all of it.

Claude Code lets you point statusLine at a command that receives session JSON on stdin and prints a one-liner. Here’s the shape of mine, configured in ~/.claude/settings.json:

{
  "statusLine": {
    "type": "command",
    "command": "~/.claude/statusline.sh",
    "refreshInterval": 30
  }
}

The full script is bundled with this post — grab statusline.sh here and drop it in ~/.claude/. It reads the JSON and renders something like:

🤖 Opus 4.8 | 📁 saasforge 🌿 main• | 47% | 📅 62% | ⏱️ 1h 12m | +340/-90 | $4.21

Every segment is there for a reason that multi-agent work makes obvious:

Model (🤖 Opus 4.8) — when you’re juggling sessions on different models, you want to know which one is in front of you.
Directory + branch (📁 saasforge 🌿 main•) — the • flags uncommitted changes. This is a tripwire: if it ever shows the main checkout when you expected a feature worktree, your context drifted and you’re about to commit to the wrong place.
Context % (47%) — computed from the transcript’s token usage, green→yellow→red as it fills. The number that tells you when a compaction is coming.
Weekly limit % (📅 62%) — the rolling 7-day subscription usage. The gauge that would have warned me before I hit the wall. Twice.
Duration, line delta, cost — session bookkeeping that’s nice to have at a glance.
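
To make two of those segments concrete, here is a minimal sketch of how a branch marker and a color-coded percentage could be produced. It is not the bundled statusline.sh: the function names `pct_color` and `branch_segment` are hypothetical, and the 50% and 80% thresholds are assumptions picked for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical segment helpers; not the script bundled with the post.

# Pick an ANSI color for a usage percentage. The thresholds are assumptions.
pct_color() {
  local pct=${1%.*}             # drop any fractional part
  if   (( pct >= 80 )); then printf '\033[31m'   # red
  elif (( pct >= 50 )); then printf '\033[33m'   # yellow
  else                       printf '\033[32m'   # green
  fi
}

# Print "branch", or "branch•" when there are uncommitted changes.
# Prints nothing outside a repository or on a detached HEAD.
branch_segment() {
  local dir=$1 branch
  branch=$(git -C "$dir" symbolic-ref --short -q HEAD 2>/dev/null) || return 0
  if [ -n "$(git -C "$dir" status --porcelain 2>/dev/null)" ]; then
    printf '%s•' "$branch"
  else
    printf '%s' "$branch"
  fi
}

# Usage:
#   printf '🌿 %s | %s%s%%\033[0m\n' "$(branch_segment "$PWD")" "$(pct_color 47)" 47
```

Treating untracked files as “dirty” (which `git status --porcelain` does by default) is a design choice; a stricter tripwire is usually the safer one when several checkouts are in play.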

The implementation has a couple of details worth stealing. It pulls every field in one jq pass rather than re-parsing per field, and it joins them with the ASCII unit separator (0x1f) instead of a tab — because unlike tab, it isn’t an IFS whitespace character, so empty fields (a missing transcript path, say) are preserved rather than silently collapsed, keeping the read variable alignment intact. Small thing, but it’s the difference between a status line that’s robust and one that scrambles when a field is empty.

# Split on the ASCII unit separator (octal 037 = 0x1f) so empty fields keep their position.
IFS=$'\037' read -r model model_id cost added removed dur_ms big transcript cwd week < <(
  printf '%s' "$input" | jq -r '[
    .model.display_name // "?",
    .model.id // "",
    .cost.total_cost_usd // 0,
    ...
  ] | map(tostring) | join("\u001f")'
)
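
The delimiter behavior is easy to verify in isolation. This small demonstration uses a hypothetical helper, `split_three`, that splits the same three-field record (with an empty middle field) once on tab and once on the unit separator:

```shell
#!/usr/bin/env bash
# Demonstrates why a non-whitespace delimiter keeps empty fields aligned.

# Split "first<delim><delim>third" into three variables and show the result.
split_three() {
  local delim=$1 a b c
  IFS=$delim read -r a b c <<< "first${delim}${delim}third"
  printf 'a=[%s] b=[%s] c=[%s]\n' "$a" "$b" "$c"
}

split_three $'\t'      # tab is IFS whitespace, so the empty field collapses
split_three $'\037'    # unit separator is not, so the empty field survives
```

The first call prints `a=[first] b=[third] c=[]`, with `third` shifted into the wrong variable. The second prints `a=[first] b=[] c=[third]`, which is the alignment the status line depends on.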

It’s about a hundred lines of bash, and it’s the cheapest situational awareness you’ll ever buy.

What I Took Away

The hackathon reframed how I think about AI-assisted development. The skill ceiling isn’t prompting — it’s orchestration: decomposing work into independent pieces, running them in parallel, keeping just enough oversight that nothing drifts, and wiring the whole thing into the tools your team already uses.

The mechanics in this post — parallel agents, the shortcuts to drive them, persistence you can rely on, skills and MCP to give them context, and a status line to stay oriented — are the scaffolding. 140 PRs in a week is what the scaffolding makes possible. Two exhausted token limits is what it costs. Worth it.

And the number that still surprises me is the slope: in the last two weeks, feature parity on the renovated service went from 60% to over 90%. It’s all merged and deployed to production, running green at 99.99% availability. There’s still work — a handful of bugs left to squash before parity is truly complete — but the finish line is close enough to see. After that, going live is a business call, not an engineering one.

Cover photo by Albert Sukhanov on Unsplash.
