Multi-Agent Orchestration
The practice of coordinating multiple LLM-based agents to work on tasks concurrently, with isolation, specialization, and collaboration.
Seven Approaches Emerging
Infrastructure-first: Scion
scion positions itself as a “hypervisor for agents” — providing the infrastructure layer (containers, isolation, lifecycle management) while treating higher-level concerns as orthogonal. Harness-agnostic. Emphasizes human interaction as imperative.
Product-first: Kiro Autonomous Agent
kiro’s autonomous agent is an opinionated product — a frontier-agent that handles the full stack from task intake to PR creation. Coordinates specialized sub-agents internally. Emphasizes autonomy and independence.
Tool-first: Claude Code
claude-code approaches multi-agent from the individual tool outward — custom subagents, agent teams, and a rich extensibility stack (MCP, plugins, skills, hooks). Terminal-native, multi-platform. The agent itself can spawn and coordinate other agents.
Company-first: Paperclip
paperclip operates a layer above the first three (Scion, Kiro, Claude Code): it orchestrates agents into companies with org charts, budgets, goals, governance, and accountability. Agent-agnostic (works with Claude Code, Codex, Cursor, etc.). It is not an agent framework but the organizational layer. “Manage business goals, not pull requests.”
Workspace-first: Gas Town
gastown is a workspace manager that coordinates 20-30+ AI coding agents with persistent work tracking via git worktrees. A “Mayor” (Claude Code instance) orchestrates “Polecats” (worker agents) across project “Rigs.” Uniquely combines: git-worktree persistence, Bors-style merge queue, three-tier agent health monitoring, session discovery (Seance), and federated cross-workspace coordination (Wasteland). Agent-agnostic — supports Claude, Codex, Copilot, Gemini, Cursor, and others as runtimes.
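To make the "Bors-style merge queue" idea concrete, here is a minimal sketch of the general technique, not Gas Town's actual implementation. It assumes a toy model in which a branch is just a named set of changes and `passes_ci` is a hypothetical stand-in for a real test run. The point it illustrates is that main only ever advances to a state that was tested after merging.

```python
from collections import deque
from typing import Callable, Deque, FrozenSet, List, Tuple

# Hypothetical toy model: a branch is a name plus the set of changes it carries.
Branch = Tuple[str, FrozenSet[str]]


def run_merge_queue(
    main: FrozenSet[str],
    queue: Deque[Branch],
    passes_ci: Callable[[FrozenSet[str]], bool],
) -> Tuple[FrozenSet[str], List[str], List[str]]:
    """Land queued branches one at a time, testing the merged result first."""
    landed: List[str] = []
    rejected: List[str] = []
    while queue:
        name, changes = queue.popleft()
        candidate = main | changes  # what main would look like after this merge
        if passes_ci(candidate):
            main = candidate        # advance main only to a tested state
            landed.append(name)
        else:
            rejected.append(name)   # main is untouched; the branch returns to its agent
    return main, landed, rejected


# Toy CI (an assumption): the two parser changes pass alone but break together.
def passes_ci(tree: FrozenSet[str]) -> bool:
    return not {"add-parser", "rename-parser"} <= tree


queue: Deque[Branch] = deque([
    ("polecat-1", frozenset({"add-parser"})),
    ("polecat-2", frozenset({"rename-parser"})),
    ("polecat-3", frozenset({"add-docs"})),
])
main, landed, rejected = run_merge_queue(frozenset(), queue, passes_ci)
```

In this run, `polecat-2` is rejected because it is tested against a main that already contains `polecat-1`'s change, which is exactly the conflict that testing each branch in isolation would miss.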
Spec-first: Symphony
symphony is OpenAI’s orchestration service published as a language-agnostic specification (78KB SPEC.md). Teams implement the protocol in their own language. Reads issues from Linear, creates per-issue isolated workspaces, runs Codex app-server sessions. WORKFLOW.md as single source of truth for prompt + config. Intentionally minimal: no database, no UI, scheduler/runner only. The coding agent handles all tracker mutations.
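The "scheduler/runner only, no database" shape described above can be sketched roughly as follows. This is an illustration under assumptions, not the protocol defined in SPEC.md: `IssueScheduler`, `tick`, and the `run_agent` hook are hypothetical names, the issue IDs are made up, and fetching issues from the tracker is left to the caller.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterable, List


class IssueScheduler:
    """In-memory scheduler: one isolated workspace directory per issue."""

    def __init__(self, root: Path, run_agent: Callable[[str, Path], None]):
        self.root = root
        self.run_agent = run_agent             # hypothetical hook that would start an agent session
        self.workspaces: Dict[str, Path] = {}  # the only state; nothing is persisted

    def tick(self, open_issue_ids: Iterable[str]) -> List[str]:
        """Dispatch issues not seen before; return the IDs that were started."""
        started: List[str] = []
        for issue_id in open_issue_ids:
            if issue_id in self.workspaces:
                continue                       # already dispatched on an earlier tick
            workspace = self.root / issue_id
            workspace.mkdir(parents=True)      # fresh directory, so issues never share files
            self.workspaces[issue_id] = workspace
            self.run_agent(issue_id, workspace)
            started.append(issue_id)
        return started


# Usage with a fake runner that only records what it was asked to do.
calls = []
with tempfile.TemporaryDirectory() as tmp:
    scheduler = IssueScheduler(Path(tmp), lambda i, w: calls.append((i, w.name)))
    first = scheduler.tick(["ENG-1", "ENG-2"])
    second = scheduler.tick(["ENG-2", "ENG-3"])
```

Because the scheduler holds its state in a plain dictionary, a restart loses track of in-flight work; a real implementation would need some way to rediscover existing workspaces, which this sketch does not attempt.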
Platform-first: Multica
multica treats coding agents as first-class teammates on a managed platform. Agents have profiles, show up on boards, post comments, report blockers, and compound reusable skills. Cloud-first (Next.js + Go + PostgreSQL) with self-hosting. Most vendor-neutral: supports 11 agent CLIs. Lighter governance than paperclip — Issues/Projects/Labels vs org charts/budgets. Thesis: two engineers + agent fleet = twenty.
Shared Patterns
The first three approaches (Scion, Kiro, Claude Code) share:
- Isolated execution: Containers (Scion) / sandboxes (Kiro) / permission modes (Claude Code)
- Git-based workspaces: Worktrees or branches per agent
- Sub-agent coordination: Agents spawning and managing other agents
- PR-based output: Changes surfaced as pull requests for human review
- Extensibility: Plugins (Scion), Powers (Kiro), MCP/plugins/skills (Claude Code), agent-skills-standard (open standard)
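The "worktree per agent" pattern from the list above can be sketched with plain git. The example below only builds the command and does not run it. `git worktree add -b <branch> <path> <commit-ish>` and `git -C <dir>` are standard git options, while the `agent/<id>` branch naming, the sibling-directory layout, and the repository path are assumptions made for illustration and are not taken from any of the tools above.

```python
from pathlib import PurePosixPath
from typing import List


def worktree_add_command(repo: PurePosixPath, agent_id: str, base: str = "main") -> List[str]:
    """Build (but do not run) a `git worktree add` call for one agent."""
    branch = f"agent/{agent_id}"                    # hypothetical branch naming scheme
    path = repo.parent / f"{repo.name}-{agent_id}"  # sibling directory outside the main checkout
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


cmd = worktree_add_command(PurePosixPath("/srv/repos/app"), "worker-1")
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would create the worktree. Each agent then edits its own checkout and branch while sharing one object store, so concurrent agents do not overwrite each other's working files.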
Key Design Tensions
- Autonomy vs. interaction: Kiro favors long-running independence; Scion says interaction is imperative; Claude Code offers configurable permission modes
- Opinionated vs. agnostic: Kiro is a specific agent product; Scion is infrastructure for any agent; Claude Code is a specific tool that Scion can orchestrate
- Isolation model: Scion uses containers + --yolo mode; Kiro uses sandboxes; Claude Code uses permission modes within the tool itself
Open-Source Multi-Agent Frameworks
Seven frameworks represent different coordination philosophies (updated May 2026):
| Framework | Core Metaphor | Coordination | State | Best For |
|---|---|---|---|---|
| autogen-multi-agent | Conversation | Multi-turn dialogue | Conversation history | ⚠️ Legacy — migrate to MAF |
| crewai-multi-agent | Team of experts | Sequential/hierarchical process | Short/long/entity memory | Complex research tasks |
| langgraph-agent-orchestration | State machine | Graph edges + conditions | Checkpointed, persistent | Production workflows |
| openai-swarm | Handoffs | Function returns | Context variables (ephemeral) | ⚠️ Legacy — see Agents SDK |
| openai-agents-sdk | Handoffs + guardrails | Function returns + MCP | Session-based | OpenAI-model deployments |
| google-adk | Workflow agents | Sequential/Parallel/Loop + transfer | Session state + memory | Multi-language, model-agnostic |
| microsoft-agent-framework | Graph workflows | Typed nodes + edges | Agent state + session | Enterprise, Azure-centric |
Convergence signal: AutoGen (→ MAF), LangGraph, Google ADK (2.0), and Microsoft Agent Framework all converge on graph-based workflows. The protocol layer (MCP for tools, A2A for agents) is standardizing interoperability.
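As a rough illustration of what "graph-based workflow" means here, the sketch below runs named nodes over a shared state and lets a per-node router choose the next edge. It is framework-independent plain Python and does not use the API of LangGraph, Google ADK, or Microsoft Agent Framework; `run_graph`, the node names, and the state keys are all hypothetical.

```python
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]
Node = Callable[[State], State]
Router = Callable[[State], Optional[str]]  # returns the next node name, or None to stop


def run_graph(nodes: Dict[str, Node], edges: Dict[str, Router], start: str,
              state: State, max_steps: int = 20) -> State:
    """Run one node at a time, letting that node's router pick the next edge."""
    current: Optional[str] = start
    steps = 0
    while current is not None:
        if steps >= max_steps:
            raise RuntimeError("graph did not terminate within max_steps")
        state = nodes[current](state)     # each node returns a new state
        current = edges[current](state)   # conditional edge based on the new state
        steps += 1
    return state


def draft(s: State) -> State:
    return {**s, "attempts": s["attempts"] + 1, "trace": s["trace"] + ["draft"]}


def review(s: State) -> State:
    # Toy rule (an assumption): the second draft is always approved.
    return {**s, "approved": s["attempts"] >= 2, "trace": s["trace"] + ["review"]}


nodes = {"draft": draft, "review": review}
edges = {
    "draft": lambda s: "review",
    "review": lambda s: None if s["approved"] else "draft",  # loop back until approved
}
final = run_graph(nodes, edges, "draft", {"attempts": 0, "approved": False, "trace": []})
```

The loop from `review` back to `draft` is the kind of cycle that a purely sequential pipeline cannot express, and the explicit state passed between nodes is what the frameworks in the table checkpoint or persist.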
Open Questions
- How should agents coordinate on shared state beyond git?
- What’s the right granularity for task decomposition?
- How to handle conflicting changes from concurrent agents?
- Does persistent context/memory (Kiro, Claude Code) outperform fresh-context-per-task?
- Will mcp-protocol become the interoperability layer between these tools?
- What design principles make skills effective? (See ten-pillars-agentic-skill-design for a proposed framework)