I wanted to see what happens when you stop treating AI as a single assistant and start treating it as a team. So I set up five Claude Code instances running simultaneously, each with a different role, all working on the same codebase.
By the end of the day, the five instances had produced 18 commits, caught 3 bugs before they reached production, and had one spectacular coordination failure that taught me more than the successes did.
What You'll See in This Post
- How I set up five specialized Claude instances on one machine
- What each instance was responsible for
- The coordination problems I didn't expect
- Actual commit output and the bugs they caught
- Whether this is worth the complexity
The Setup
Five Claude Code sessions, each running in its own terminal on a Replit workspace. Same repo, same branch, different jobs:
Instance 1 — Developer. Builds features, writes code, creates adapters. The one that produces the most commits.
Instance 2 — Reviewer. Reviews every commit the developer makes. Checks for missing tests, hardcoded values, inconsistent error handling. Doesn't write code, only finds problems.
Instance 3 — Maintainer. Monitors external API changes, runs health checks, keeps documentation current. The quiet one that prevents future problems.
Instance 4 — Quality controller. Audits memory files, skill definitions, and cross-bot consistency. Makes sure the system doesn't accumulate technical debt.
Instance 5 — Coordinator. Decides priorities, assigns work, resolves conflicts, reflects on what's working and what isn't.
Each instance has its own CLAUDE.md section, its own memory directory, and its own set of skills. They communicate through Discord messages and git commits.
What Went Well
The reviewer caught things the developer missed. In one case, the developer built an adapter with a hardcoded API key format. The reviewer flagged it within minutes. Without a dedicated review instance, this would've gone to production.
Parallel work is genuinely faster. While the developer was building three new adapters, the maintainer was simultaneously checking that existing adapters still worked after upstream API changes. These are independent tasks that would've been sequential with one instance.
Specialization improves quality. When one instance only does reviews, it gets good at reviews. It develops patterns, remembers what to check, builds up a mental model of common mistakes. A generalist instance doing everything doesn't develop that depth.
The Coordination Failure
Here's where it got interesting. The developer committed a fix to an adapter. At the same time, the maintainer was running a health check that read the same file. The maintainer's check used the old version, found a "bug" that the developer had already fixed, and filed a report.
The coordinator saw the report and assigned the developer to fix it. The developer looked at the code, saw it was already fixed, and committed a "fix" that was actually a no-op. The reviewer reviewed the no-op commit and flagged it as suspicious.
Four instances spent 15 minutes on a problem that didn't exist, because of a message timing overlap.
⚠️ When multiple agents work on the same codebase, git commit hashes are the source of truth, not message order. We added a rule: before acting on any report, check
git logto see if the issue has already been addressed.
By the Numbers
- 18 commits in one working day
- 3 bugs caught by the reviewer before they could cause problems
- 1 false positive from the timing issue described above
- ~2x token usage compared to a single instance doing all the work
- Actual time saved: roughly 4 hours of sequential work compressed into parallel execution
What I'd Do Differently
💡 Start with two instances, not five. Developer + reviewer is the highest-value split. Add more only when you have clear, independent workstreams.
💡 Use git as the coordination layer. Discord messages can arrive out of order. Git commits are immutable and ordered. Every important decision should reference a commit hash.
💡 Give each instance a clear "don't touch" boundary. The developer doesn't review. The reviewer doesn't write code. The moment roles blur, you get duplicated work.
💡 Write handoff notes. When a session ends, each instance writes what it finished, what's blocked, and who's waiting on what. Without this, the next session starts cold.
Is It Worth It?
Honestly? For most tasks, a single Claude Code instance with good skills and hooks is enough. The parallel setup shines when you have genuinely independent workstreams that would take a long time sequentially.
The sweet spot is two to three instances: a developer and a reviewer at minimum, plus a coordinator if the project is complex enough. Five is possible but the coordination overhead starts eating into the productivity gain.
❌ Don't do this for a quick bug fix or a single feature. The setup time isn't worth it.
✅ Do consider it for large projects with multiple independent tracks: building new features while maintaining existing ones while monitoring external dependencies.
Want to Try It?
You need Claude Code and multiple terminal sessions. Each session runs independently with its own context.
- Set up a shared repo with clear directory boundaries
- Give each instance its own
.claude/skills/for role-specific behavior - Use
memory/bot-N/directories for instance-specific notes - Establish a communication channel (Discord, Slack, or just git commit messages)
- Write coordination rules in CLAUDE.md that all instances follow
Start with two instances. Add more when you feel the bottleneck.