Worth experimenting with, but not production-ready. Agent Teams is Claude Code’s most ambitious coordination feature: multiple Claude instances working in parallel with peer-to-peer communication and shared task lists. The architecture is sound. The implementation has critical bugs (message delivery failures in VS Code and tmux, file conflict risks). Best use today: bounded experiments on greenfield modules with clear file ownership boundaries. Eric should try it on one Donna feature sprint to calibrate — but keep Cursor’s built-in Task subagents as the daily workhorse.
A Claude Code session acts as “team lead” and spawns independent teammates, each running as a full Claude instance with its own context window. Teammates communicate directly with each other through messages and coordinate via a shared task list. Unlike subagents (which are fire-and-forget workers reporting results back), teammates can discuss, challenge, and build on each other’s work mid-task. Enabled with one environment variable: CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1. Announced Feb 5, 2026 alongside Opus 4.6.1
| Dimension | Rating | Detail |
|---|---|---|
| Maturity | Experimental | Shipped Feb 5 2026. Disabled by default. Known messaging bugs in VS Code and tmux backends.2,3 |
| Documentation | Adequate | Official docs on code.claude.com cover basics. Community guides (claudefast, Medium) fill gaps. No official troubleshooting guide.1,4 |
| Community | Growing | r/ClaudeAI and r/ClaudeCode active. GitHub issues being filed and triaged. 3 weeks old — too early for established patterns.5,6 |
| Adoption | Early Adopter | Nicholas Carlini’s 16-agent C compiler is the flagship demo ($20K, 100K lines, 2 weeks). Real-world production use reports are thin.7,8 |

| Project | Fit | Rationale |
|---|---|---|
| Donna multi-tenant build | Strong | Donna needs parallel work across channel adapters (WA, Telegram, email). Each adapter = independent file set = ideal for agent teams. |
| Sourcy bot iteration | Weak | Single-file prompt engineering. Sequential by nature. Subagents better. |
| Talent Coop math pipeline | Moderate | Could parallelize question gen + grading + curriculum mapping. But pipeline is sequential — benefits limited. |
| PCRM research reports | Weak | Research is sequential (frame → search → synthesize). Single session with good context is better. |
| Wenhao blue-collar app | Strong | Frontend (React Native) + backend (API) + vision pipeline = three independent layers. Natural team split. |
Agent Teams uses a flat peer-to-peer topology. One session acts as team lead (spawns teammates and synthesizes results), but teammates communicate directly with each other — there is no hierarchy beyond the lead’s spawn privilege. All agents share the same filesystem (git repository).
```
┌─────────────────────────────────────────────┐
│                  TEAM LEAD                  │
│ (orchestration, task assignment, synthesis) │
│                                             │
│ Shared Task List ↔ All teammates read/write │
│                                             │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐ │
│  │Teammate A│ ↔ │Teammate B│ ↔ │Teammate C│ │
│  │(frontend)│   │(backend) │   │(tests)   │ │
│  └──────────┘   └──────────┘   └──────────┘ │
│       ↕              ↕              ↕       │
│           Shared Git Repository             │
└─────────────────────────────────────────────┘
```
The flat structure means any teammate can message any other teammate directly. The lead doesn’t relay messages — it orchestrates task assignment and collects final outputs. The shared filesystem (git repo) is the implicit coordination layer: all agents read and write files in the same directory tree.
1. Spawning. The team lead creates teammates via natural language. Each gets its own context window, its own tool permissions, and its own conversation history. Teammates are spawned with a specific task description that scopes their work. The lead can spawn as many teammates as needed, though practical limits emerge around 4–6 due to coordination overhead.1
2. Communication. Three channels exist: (a) direct messages between any two agents, (b) broadcast from lead to all teammates, and (c) a shared task list readable and writable by all. Messages are file-based — each agent has an inbox file on disk, with JSON entries carrying read/unread flags. This is the fragile part: message delivery bugs are the #1 reported issue. In VS Code, teammate messages never surface to the team lead. In tmux on macOS, teammates never poll their inbox files.2,3
3. Task List. A shared data structure that any teammate can read, claim, and update. The lead uses it to track progress and reassign work. Tasks should be sized at 5–6 per teammate for optimal coordination — too few and agents idle waiting; too many and context thrashing eats the gains.4
4. Delegate Mode. Activated by pressing Shift+Tab. Restricts the lead to coordination only — no code writing. This is critical for setups with 4+ teammates: without delegate mode, the lead gets pulled into implementation details and stops coordinating. The result is a traffic jam where teammates block on the lead’s attention.9
| Dimension | Subagents | Agent Teams |
|---|---|---|
| Architecture | Hub-and-spoke (subagent reports to main) | Peer-to-peer (teammates message each other)1,10 |
| Context | Subagent gets fresh context, returns result | Each teammate maintains persistent context across wake cycles10 |
| Communication | Result-only (no mid-task discussion) | Direct messaging + broadcast + shared tasks1 |
| Token cost | ~20K overhead per spawn | ~2× single session (each teammate = full Claude instance)10,11 |
| Best for | Focused tasks, clear input→output | Complex work requiring discussion and collaboration1 |
| Failure mode | Subagent fails silently, main continues | Teammate messaging fails → deadlock2,3 |
Each teammate runs as a separate Claude Code process. Communication happens via inbox files on disk — JSON files with read/unread flags that agents poll periodically. The shared filesystem is a shared git repository: all agents push and pull from the same working tree.
In Carlini’s 16-agent compiler project, the setup was 16 Docker containers sharing a git repo, with lock files for task claiming to avoid race conditions.7
Token consumption scales with team size. Each teammate loads full project context + conversation history + messages from other agents. For a 3-agent team on a medium project, expect 300K–500K tokens per session. The messaging overhead alone can account for 15–20% of total tokens, as every message is re-read on each polling cycle.
| Scenario | Cost | Source |
|---|---|---|
| Single session (daily avg) | ~$6/day | Anthropic official11 |
| Single session (90th percentile) | ~$12/day | Anthropic official11 |
| Agent Teams (estimated) | ~$12–24/day | 2× multiplier from persistent contexts + messaging overhead10,11 |
| Carlini 16-agent compiler | ~$20,000 / 2 weeks | ~$1,430/day across 16 agents7 |
| Penny’s approach (for reference) | ~$40/month total | $20 ChatGPT + $20 Claude, disciplined usage13 |
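The 2× estimate in the table can be made concrete with a back-of-envelope model. The formula below is an illustrative assumption assembled from the numbers above (one full instance per teammate plus the lead, 15-20% messaging overhead), not measured data:

```python
# Rough daily-cost model for an agent team. Inputs come from the
# estimates in the text; the formula itself is an assumption.

def team_cost_per_day(single_session_cost: float,
                      teammates: int,
                      messaging_overhead: float = 0.20) -> float:
    """Lead + N teammates, each roughly one full session's worth of
    tokens, inflated by message re-reads on every polling cycle."""
    base = single_session_cost * (1 + teammates)  # lead + teammates
    return base * (1 + messaging_overhead)

# e.g. team_cost_per_day(6.0, 2) ≈ $21.6/day for a lead plus two
# teammates against the $6/day single-session average.
```

Against the $6/day average, even one teammate lands in the $12-24/day band from the table; three or more teammates exceed it, which is consistent with the claim that cost scales with team size.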
Observed failure patterns from Reddit and GitHub issue reports:
--dangerously-skip-permissions flag, but security tradeoff5

| Dimension | Status | Detail |
|---|---|---|
| Error handling | Missing | No graceful recovery from teammate crashes |
| Logging/observability | Partial | In-process mode shows teammate status; no structured logs |
| Rate limiting | Mature | Inherits Claude Code rate limits per org |
| Session management | Missing | No resumption, no persistence |
| Security model | Partial | Permission inheritance broken; --dangerously-skip-permissions is the workaround |
| Rollback/recovery | Missing | Teammate failure = lost work |
Success stories:
Failure stories:
--dangerously-skip-permissions.5

Workarounds practitioners use:
--dangerously-skip-permissions or sandbox with auto-allow bash5

“16 agents built a C compiler!” is the headline. Anthropic shipped Agent Teams alongside Opus 4.6 — the message is clear: this is the future of AI-assisted development. The community is excited about “directing a team” instead of “prompting a tool.”
| Challenge | Question | Evidence |
|---|---|---|
| Inversion | “What if single-session Claude with good prompts beats Agent Teams for 95% of real tasks?” | The same-app comparison showed Agent Teams added coordination overhead for marginal architectural improvement. Most real coding work is sequential reasoning, not parallel execution. Eric’s own work (research, prompt engineering, bot iteration) is fundamentally sequential.6,14 |
| Base rates | “What’s the historical success rate for multi-agent coordination?” | Microsoft’s AutoGen, CrewAI, LangGraph agents — the graveyard of multi-agent frameworks is vast. Coordination is the hard problem in distributed systems, not parallelism. Agent Teams inherits this challenge.16 |
| Survivorship | “Are we only hearing from people who made it work?” | Carlini’s compiler project had ideal conditions: well-defined spec, comprehensive test suite, independent compilation units. The Rails migration failure is more representative of real projects. Selection bias toward impressive demos. |
| Incentive mapping | “Who benefits from us adopting Agent Teams?” | Anthropic. More agents = more tokens = more revenue. The 2× cost multiplier is a feature for them, not a bug. Agent Teams is a token-consumption accelerator marketed as a productivity tool.10,11 |
| Time horizon | “Is this ready now or eventually?” | The messaging bugs are not edge cases — they affect VS Code (the most popular editor) and tmux (the standard terminal multiplexer). These are fundamental infrastructure issues, not polish. Estimate 2–4 months before basic reliability. |
For now, TRIAL on bounded experiments only. Eric’s daily workhorse should remain Cursor’s Task subagents — they’re reliable, lower cost, and sufficient for 90%+ of his work.
1. Install Claude Code: npm install -g @anthropic-ai/claude-code
2. Enable the flag: export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
3. Launch: claude
4. Spawn the team: “Create a team: one agent for the API routes in src/api/, one for the React components in src/components/, one for tests in tests/. Each owns their directory exclusively.”
| Parameter | Recommendation |
|---|---|
| Recommended first use | A new feature with 2–3 independent file areas |
| Scope | 2 teammates + lead in delegate mode |
| Success criteria | Both teammates complete their tasks without file conflicts or deadlocks |
| Time estimate | 30–60 minutes including setup |
| Example | “Add a settings page (frontend teammate) and settings API endpoint (backend teammate) with tests (lead coordinates).” |
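The "no file conflicts" success criterion can be checked mechanically after the run: collect each teammate's changed files (for example from git history) and verify the sets are disjoint. A minimal sketch of the check itself; gathering the per-teammate file lists from git is left to the caller:

```python
# Given each teammate's set of changed file paths, report any file
# touched by more than one teammate (an ownership-boundary violation).

def ownership_violations(changes: dict[str, set[str]]) -> set[str]:
    """Return the files touched by more than one teammate."""
    seen: dict[str, str] = {}       # path -> first teammate to touch it
    conflicts: set[str] = set()
    for agent, files in changes.items():
        for path in files:
            if path in seen and seen[path] != agent:
                conflicts.add(path)
            seen[path] = agent
    return conflicts
```

An empty result means the experiment met the file-ownership criterion; any non-empty result names exactly which boundaries were crossed.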
--dangerously-skip-permissions or configure allowlists in advance5

Worth experimenting with. Allocate a bounded 2-hour session to test on one real task with clear file ownership boundaries. Do not adopt for daily workflow until message delivery bugs are fixed (estimate: 1–2 months).
Try first Donna multi-tenant build. Three channel adapters (WA, Telegram, email) are independent file sets. Spawn 3 teammates, one per adapter. Test on next Donna sprint. If it works, this becomes the template for all multi-adapter builds.
Second experiment Wenhao blue-collar app. Frontend + backend + vision pipeline split is natural. But coordinate via git — don’t rely on Agent Teams messaging (use the Carlini pattern: shared repo, lock files).
Skip Sourcy / research / PCRM. Sequential work. Single session or Cursor Task subagents are better.
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 — 5 minutes