Multi-Agent Orchestration

Multi-agent orchestration is the practice of coordinating multiple LLM-based agents that work on tasks concurrently, balancing isolation, specialization, and collaboration.

Seven Approaches Emerging

Infrastructure-first: Scion

scion positions itself as a “hypervisor for agents” — providing the infrastructure layer (containers, isolation, lifecycle management) while treating higher-level concerns as orthogonal. Harness-agnostic. Emphasizes human interaction as imperative.

Product-first: Kiro Autonomous Agent

kiro’s autonomous agent is an opinionated product — a frontier-agent that handles the full stack from task intake to PR creation. Coordinates specialized sub-agents internally. Emphasizes autonomy and independence.

Tool-first: Claude Code

claude-code approaches multi-agent from the individual tool outward — custom subagents, agent teams, and a rich extensibility stack (MCP, plugins, skills, hooks). Terminal-native, multi-platform. The agent itself can spawn and coordinate other agents.

Company-first: Paperclip

paperclip sits one layer above the three preceding approaches (Scion, Kiro, Claude Code), orchestrating agents into companies with org charts, budgets, goals, governance, and accountability. Agent-agnostic (works with Claude Code, Codex, Cursor, etc.). Not an agent framework: it is the organizational layer. “Manage business goals, not pull requests.”

Workspace-first: Gas Town

gastown is a workspace manager that coordinates 20-30+ AI coding agents with persistent work tracking via git worktrees. A “Mayor” (Claude Code instance) orchestrates “Polecats” (worker agents) across project “Rigs.” Uniquely combines: git-worktree persistence, Bors-style merge queue, three-tier agent health monitoring, session discovery (Seance), and federated cross-workspace coordination (Wasteland). Agent-agnostic — supports Claude, Codex, Copilot, Gemini, Cursor, and others as runtimes.
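The Bors-style merge queue mentioned above can be illustrated with a small simulation. This is only a sketch of the general idea (test the tentative merge result against the current main, and advance main only when that result passes), not Gas Town's implementation. The `MergeQueue` class, its `submit` and `process` methods, and the dict-based repository model are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

# A "repository state" is modelled as a dict of file name -> content.
Repo = Dict[str, str]
# A change is a name plus a function that returns an updated copy of the repo.
Change = Tuple[str, Callable[[Repo], Repo]]


class MergeQueue:
    """Toy Bors-style queue: main only advances to states that passed tests."""

    def __init__(self, main: Repo, tests: Callable[[Repo], bool]) -> None:
        self.main: Repo = dict(main)
        self.tests = tests
        self.pending: Deque[Change] = deque()
        self.rejected: List[str] = []

    def submit(self, name: str, apply: Callable[[Repo], Repo]) -> None:
        """Queue a change; nothing is merged until process() runs."""
        self.pending.append((name, apply))

    def process(self) -> List[str]:
        """Try each queued change against the *current* main, in order."""
        merged: List[str] = []
        while self.pending:
            name, apply = self.pending.popleft()
            candidate = apply(dict(self.main))  # tentative merge result
            if self.tests(candidate):
                self.main = candidate           # advance main to the tested state
                merged.append(name)
            else:
                self.rejected.append(name)      # main is left untouched
        return merged
```

Because every change is tested on top of whatever has already been merged, a failing worker branch never blocks or corrupts the changes queued behind it.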

Spec-first: Symphony

symphony is OpenAI’s orchestration service published as a language-agnostic specification (78KB SPEC.md). Teams implement the protocol in their own language. Reads issues from Linear, creates per-issue isolated workspaces, runs Codex app-server sessions. WORKFLOW.md as single source of truth for prompt + config. Intentionally minimal: no database, no UI, scheduler/runner only. The coding agent handles all tracker mutations.
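The "scheduler/runner only, no database" shape described above can be sketched as a single pass over open issues, where each issue gets its own workspace directory. This is not the protocol from SPEC.md: the `Issue` type, `run_once`, `fake_runner`, and the directory naming are hypothetical, and an in-memory list and a plain function stand in for the tracker and the coding-agent session.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Issue:
    id: str      # hypothetical tracker identifier, e.g. "ENG-1"
    title: str


def run_once(
    issues: Iterable[Issue],
    root: Path,
    runner: Callable[[Issue, Path], str],
) -> Dict[str, str]:
    """One scheduler pass: one isolated workspace per issue, no database.

    The only state kept between passes is the set of directories on disk.
    """
    results: Dict[str, str] = {}
    for issue in issues:
        workspace = root / issue.id
        if workspace.exists():
            continue  # already picked up on an earlier pass
        workspace.mkdir(parents=True)
        results[issue.id] = runner(issue, workspace)
    return results


def fake_runner(issue: Issue, workspace: Path) -> str:
    """Stand-in for an agent session; it only writes a note to its workspace."""
    (workspace / "NOTES.md").write_text(f"Working on: {issue.title}\n")
    return "done"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        issues = [Issue("ENG-1", "Fix login"), Issue("ENG-2", "Add cache")]
        print(run_once(issues, Path(tmp), fake_runner))  # both issues run
        print(run_once(issues, Path(tmp), fake_runner))  # nothing left to do
```

Using the filesystem as the only record of what has been started is one way to stay database-free; a real implementation would also need to handle retries and crashed sessions.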

Platform-first: Multica

multica treats coding agents as first-class teammates on a managed platform. Agents have profiles, show up on boards, post comments, report blockers, and accumulate reusable skills. Cloud-first (Next.js + Go + PostgreSQL), with a self-hosting option. The most vendor-neutral of the tools covered here: supports 11 agent CLIs. Lighter governance than paperclip: Issues/Projects/Labels rather than org charts and budgets. Thesis: two engineers + agent fleet = twenty.

Shared Patterns

The first three approaches (Scion, Kiro, Claude Code) share:

Isolated execution: Containers (Scion) / sandboxes (Kiro) / permission modes (Claude Code)
Git-based workspaces: Worktrees or branches per agent
Sub-agent coordination: Agents spawning and managing other agents
PR-based output: Changes surfaced as pull requests for human review
Extensibility: Plugins (Scion), Powers (Kiro), MCP/plugins/skills (Claude Code), agent-skills-standard (open standard)
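The "worktree or branch per agent" pattern in the list above maps onto the standard `git worktree add -b <branch> <path> <base>` command. The sketch below builds and optionally runs that command. The `agent/<name>` branch naming, the sibling-directory layout, and the function names are illustrative assumptions, not a convention of any tool named here.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_command(repo: Path, agent: str, base: str = "main") -> List[str]:
    """Build the git command that gives `agent` its own branch and checkout.

    The branch name (agent/<name>) and the sibling directory
    (<repo>-<name>) are assumptions made for this example.
    """
    path = repo.parent / f"{repo.name}-{agent}"
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{agent}",  # create a new branch for this agent
        str(path),               # where the separate checkout lives
        base,                    # commit-ish the new branch starts from
    ]


def create_worktree(repo: Path, agent: str, base: str = "main") -> None:
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_command(repo, agent, base), check=True)
```

Each agent then edits files in its own directory while sharing one object store, so concurrent agents do not overwrite each other's working trees and their branches can be turned into pull requests independently.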

Key Design Tensions

Autonomy vs. interaction: Kiro favors long-running independence; Scion says interaction is imperative; Claude Code offers configurable permission modes
Opinionated vs. agnostic: Kiro is a specific agent product; Scion is infrastructure for any agent; Claude Code is a specific tool that Scion can orchestrate
Isolation model: Scion uses containers + --yolo mode; Kiro uses sandboxes; Claude Code uses permission modes within the tool itself

Open-Source Multi-Agent Frameworks

Seven frameworks represent different coordination philosophies (updated May 2026):

Framework	Core Metaphor	Coordination	State	Best For
autogen-multi-agent	Conversation	Multi-turn dialogue	Conversation history	⚠️ Legacy — migrate to MAF
crewai-multi-agent	Team of experts	Sequential/hierarchical process	Short/long/entity memory	Complex research tasks
langgraph-agent-orchestration	State machine	Graph edges + conditions	Checkpointed, persistent	Production workflows
openai-swarm	Handoffs	Function returns	Context variables (ephemeral)	⚠️ Legacy — see Agents SDK
openai-agents-sdk	Handoffs + guardrails	Function returns + MCP	Session-based	OpenAI-model deployments
google-adk	Workflow agents	Sequential/Parallel/Loop + transfer	Session state + memory	Multi-language, model-agnostic
microsoft-agent-framework	Graph workflows	Typed nodes + edges	Agent state + session	Enterprise, Azure-centric

Convergence signal: AutoGen (→ MAF, i.e. Microsoft Agent Framework), LangGraph, Google ADK (2.0), and Microsoft Agent Framework are all converging on graph-based workflows. Meanwhile, the protocol layer (MCP for tools, A2A for agents) is standardizing interoperability.
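To make "graph-based workflow" concrete, here is a minimal sketch of the underlying state-machine idea: nodes transform a shared state, routers act as conditional edges, and a checkpoint is recorded after every step. It does not use or mirror the API of any framework in the table. `run_graph`, the node and router names, and the draft/review loop are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Node = Callable[[State], State]
# A router inspects the state and names the next node, or None to stop.
Router = Callable[[State], Optional[str]]


def run_graph(
    nodes: Dict[str, Node],
    routers: Dict[str, Router],
    start: str,
    state: State,
    max_steps: int = 20,
) -> Tuple[State, List[Tuple[str, State]]]:
    """Run nodes until a router returns None; record a checkpoint per step.

    max_steps bounds the loop so a cyclic graph cannot run forever.
    """
    checkpoints: List[Tuple[str, State]] = []
    current: Optional[str] = start
    for _ in range(max_steps):
        if current is None:
            break
        state = nodes[current](dict(state))        # each node gets a copy
        checkpoints.append((current, dict(state)))
        current = routers[current](state)          # conditional edge
    return state, checkpoints


# Hypothetical two-node loop: draft, then review until approved.
def draft(state: State) -> State:
    state["draft"] = state.get("draft", 0) + 1
    return state


def review(state: State) -> State:
    state["approved"] = state["draft"] >= 2  # approve the second draft
    return state


NODES: Dict[str, Node] = {"draft": draft, "review": review}
ROUTERS: Dict[str, Router] = {
    "draft": lambda s: "review",
    "review": lambda s: None if s["approved"] else "draft",
}
```

Calling `run_graph(NODES, ROUTERS, "draft", {})` walks draft, review, draft, review and then stops. The checkpoint list is what would let a persistent implementation resume a run from its last completed step.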

Open Questions

How should agents coordinate on shared state beyond git?
What’s the right granularity for task decomposition?
How should conflicting changes from concurrent agents be handled?
Does persistent context/memory (Kiro, Claude Code) outperform fresh-context-per-task?
Will mcp-protocol become the interoperability layer between these tools?
What design principles make skills effective? (See ten-pillars-agentic-skill-design for a proposed framework)