Multi-Agent Coordination — From Solo to Team

Lesson 9 · Safe Agentic Workflows · ~12 minutes

Every lesson so far has focused on a single agent doing a job: one workflow, one prompt, one output. But some tasks are too big for one agent's context window, or too important to trust without independent review. That's where multi-agent coordination comes in.

The good news: if you've been using /comply to spawn an independent reviewer subagent, you're already doing multi-agent coordination. This lesson formalizes those patterns and shows you how to scale them.

When You Need Multiple Agents

A single agent hits its limits when:

Context overflow — the task requires reading more files, docs, or history than one context window can hold
Independent review — the implementer can't objectively evaluate its own output (same reason humans do code review)
Specialization — different parts of the task require different expertise (security vs. performance vs. UX)
Parallelism — subtasks are independent and can run concurrently to save time

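The key principle: the cost of coordinating agents must be less than the value you gain by splitting the work, whether that gain is extra context, independence, specialization, or parallelism. Don't split a 5-minute task across 3 agents just because you can.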

SAW's Approach: 11 Roles with Defined Boundaries

The Safe Agentic Workflow (SAW) framework defines 11 agent roles, each with explicit boundaries, exit states, and handoff patterns. You don't need all 11 — most solo developers use 3-4 — but knowing the full set helps you pick the right ones.

Role	Responsibility	Exit State
Planner	Breaks work into tasks, defines acceptance criteria	Plan document with task list
Researcher	Gathers context, reads docs, explores codebase	Context summary for downstream agents
Implementer	Writes code, creates files, makes changes	Working code with passing tests
Reviewer	Independent quality review of implementation	Approve or reject with specific feedback
QAS (Quality Assurance & Security)	Validates security, compliance, and quality gates	Pass/fail with evidence
Shipper	Deploys, merges, or publishes approved work	Artifact in production/main
Triage	Categorizes incoming work, assigns priority	Labeled and assigned issue
Monitor	Watches for regressions, drift, or anomalies	Alert or noop
Documenter	Updates docs, changelogs, and knowledge bases	Updated documentation
Orchestrator	Dispatches and coordinates other agents	All subtasks complete
Responder	Handles external queries (issues, discussions)	Response posted
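As a rough illustration, the table above can be captured as a small lookup so that a downstream agent knows what each role hands over. This is only a sketch: the `Role` enum, the `EXIT_STATES` mapping, and the `handoff` helper are hypothetical names for this example, not part of the SAW specification.

```python
from enum import Enum


class Role(Enum):
    """The 11 SAW roles listed in the table (names taken from the lesson)."""
    PLANNER = "Planner"
    RESEARCHER = "Researcher"
    IMPLEMENTER = "Implementer"
    REVIEWER = "Reviewer"
    QAS = "QAS"
    SHIPPER = "Shipper"
    TRIAGE = "Triage"
    MONITOR = "Monitor"
    DOCUMENTER = "Documenter"
    ORCHESTRATOR = "Orchestrator"
    RESPONDER = "Responder"


# Exit states copied from the table; the dict itself is an illustrative structure.
EXIT_STATES = {
    Role.PLANNER: "Plan document with task list",
    Role.RESEARCHER: "Context summary for downstream agents",
    Role.IMPLEMENTER: "Working code with passing tests",
    Role.REVIEWER: "Approve or reject with specific feedback",
    Role.QAS: "Pass/fail with evidence",
    Role.SHIPPER: "Artifact in production/main",
    Role.TRIAGE: "Labeled and assigned issue",
    Role.MONITOR: "Alert or noop",
    Role.DOCUMENTER: "Updated documentation",
    Role.ORCHESTRATOR: "All subtasks complete",
    Role.RESPONDER: "Response posted",
}


def handoff(role: Role) -> str:
    """Describe what a downstream agent can expect from `role`."""
    return f"{role.value} exits with: {EXIT_STATES[role]}"
```

Calling `handoff(Role.MONITOR)` returns the string "Monitor exits with: Alert or noop", which is the kind of contract a downstream agent would rely on.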

The QAS Gate Pattern

The most critical multi-agent pattern is the QAS gate: an independent quality and security review that sits between implementation and shipping. It must be a separate agent (or subagent) because:

Independence — the implementer is biased toward its own output. A fresh context catches what the implementer missed.
Clean context — the reviewer sees only the output, not the messy exploration that produced it. This mimics how a human reviewer reads a PR.
Accountability — if the reviewer approves something broken, you know exactly where the gate failed.

💡 You're already doing this

When /comply spawns a reviewer subagent, that's a QAS gate. The subagent has fresh context, reviews independently, and produces a pass/fail verdict. The pattern is the same whether it's checking code style or security posture.
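To make the gate concrete, here is a minimal sketch of the idea, assuming the reviewer is a function that receives only the final diff and returns a pass/fail verdict with evidence. The names `Verdict`, `secrets_reviewer`, `qas_gate`, and `ship_if_approved` are hypothetical, and the string check stands in for what a real reviewer subagent would do.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Verdict:
    """Pass/fail with evidence, mirroring the QAS exit state."""
    passed: bool
    evidence: str


# A reviewer is any callable that sees ONLY the final output (here, a diff).
Reviewer = Callable[[str], Verdict]


def secrets_reviewer(diff: str) -> Verdict:
    """Toy stand-in for a QAS subagent: flags an obvious hard-coded secret."""
    for line in diff.splitlines():
        if line.startswith("+") and "API_KEY=" in line:
            return Verdict(False, f"possible secret: {line.strip()}")
    return Verdict(True, "no hard-coded API_KEY in added lines")


def qas_gate(diff: str, reviewer: Reviewer) -> Verdict:
    """Run the reviewer on the output alone; no implementer history is passed in."""
    return reviewer(diff)


def ship_if_approved(diff: str, reviewer: Reviewer) -> str:
    """Shipping happens only behind the gate."""
    verdict = qas_gate(diff, reviewer)
    return "shipped" if verdict.passed else f"blocked: {verdict.evidence}"
```

The design point is in the signatures: the gate has no parameter through which the implementer's reasoning could leak, so the verdict depends on the output alone.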

gh-aw's Approach: OrchestratorOps + Dispatch Workflow

GitHub Agentic Workflows implements multi-agent coordination through OrchestratorOps — a workflow that doesn't do work itself, but dispatches specialized worker workflows and coordinates their results.

The pattern is simple: one orchestrator workflow receives a trigger, decides what needs to happen, and fires dispatch-workflow calls to specialized workers. Each worker runs in its own context with its own permissions and budget.

---
description: "Orchestrator — routes incoming issues to the right workflow"

on:
  issues:
    types: [opened, labeled]

permissions:
  contents: read
  issues: write

engine:
  id: copilot
  model: gpt-4.1

safe-outputs:
  dispatch-workflow:
    allowed:
      - research-and-plan.md
      - implement-feature.md
      - security-review.md
---

# Route this issue

Based on the issue labels and content, dispatch to the appropriate
worker workflow:

- `bug` label → implement-feature.md (with fix instructions)
- `feature` label → research-and-plan.md (needs planning first)
- `security` label → security-review.md (needs audit)

Pass the issue number and relevant context to the dispatched workflow.
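In the workflow above the routing decision is made by the agent reading the prompt. As a hedged illustration of the same logic, here is a deterministic Python sketch that maps labels to worker workflows and refuses anything outside the allow-list. `ALLOWED_WORKFLOWS`, `LABEL_ROUTES`, and `route_issue` are hypothetical names for this example, not gh-aw APIs.

```python
# Mirrors the `allowed` list in the orchestrator's front matter.
ALLOWED_WORKFLOWS = {
    "research-and-plan.md",
    "implement-feature.md",
    "security-review.md",
}

# Mirrors the label-to-workflow bullets in the orchestrator prompt.
LABEL_ROUTES = {
    "bug": "implement-feature.md",
    "feature": "research-and-plan.md",
    "security": "security-review.md",
}


def route_issue(labels: list[str]) -> list[str]:
    """Return the worker workflows to dispatch for an issue, in label order."""
    targets: list[str] = []
    for label in labels:
        workflow = LABEL_ROUTES.get(label)
        # Skip unknown labels, workflows outside the allow-list, and duplicates.
        if workflow and workflow in ALLOWED_WORKFLOWS and workflow not in targets:
            targets.append(workflow)
    return targets
```

An issue labeled only `docs` yields an empty list, so the orchestrator dispatches nothing rather than guessing.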

The Research-Plan-Assign Pattern

The most common orchestration pattern chains three phases: research the problem, plan the solution, then assign the implementation. Each phase can be a separate agent with appropriate context and permissions.

This is what SAW calls ResearchPlanAssignOps — a reusable orchestration pattern. The orchestrator doesn't need to understand the domain; it just routes context between specialists.
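The chain can be sketched as three phases passing one work item along, with the orchestrator doing nothing but calling them in order. This is an illustrative sketch only: `WorkItem`, the phase functions, and the placeholder acceptance criteria are hypothetical, and each phase would be a separate agent in practice.

```python
from dataclasses import dataclass, field


@dataclass
class WorkItem:
    """Context handed from phase to phase (hypothetical shape)."""
    issue_number: int
    context_summary: str = ""
    acceptance_criteria: list[str] = field(default_factory=list)
    assignee: str = ""


def research(item: WorkItem) -> WorkItem:
    """Researcher: produce a context summary for downstream agents."""
    item.context_summary = f"Context gathered for issue #{item.issue_number}"
    return item


def plan(item: WorkItem) -> WorkItem:
    """Planner: requires research output, then defines acceptance criteria."""
    if not item.context_summary:
        raise ValueError("plan phase needs the research summary")
    item.acceptance_criteria = ["tests pass", "docs updated"]  # placeholder criteria
    return item


def assign(item: WorkItem) -> WorkItem:
    """Assign: requires a plan, then hands the issue to an implementer."""
    if not item.acceptance_criteria:
        raise ValueError("assign phase needs acceptance criteria")
    item.assignee = "copilot"
    return item


def orchestrate(issue_number: int) -> WorkItem:
    """The orchestrator only routes context; it contains no domain logic."""
    item = WorkItem(issue_number)
    for phase in (research, plan, assign):
        item = phase(item)
    return item
```

Each phase checks that its predecessor's exit state exists before doing anything, which is the handoff contract in miniature.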

Extending Your Existing Pattern

You already have multi-agent coordination via /comply. Here's how to extend it:

Extension 1: Parallel Specialized Reviewers

Instead of one reviewer subagent, spawn two in parallel — one for code quality and one for security:

# In your /comply workflow, dispatch two reviewers:

safe-outputs:
  dispatch-workflow:
    allowed:
      - code-review.md      # Style, DRY, correctness
      - security-review.md  # Auth, injection, secrets

# Both run independently. Both must pass before shipping.

This gives you the same pattern as a team with separate code review and security review — without needing two humans.
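The "both must pass" rule can be sketched in a few lines of Python, assuming each reviewer is a function that takes the diff and returns True or False. The two toy checks below are hypothetical stand-ins for the real reviewer subagents, and the thread pool only illustrates that the reviews are independent and can run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def code_review(diff: str) -> bool:
    """Toy quality check: reject leftover TODO markers."""
    return "TODO" not in diff


def security_review(diff: str) -> bool:
    """Toy security check: reject an obvious hard-coded password."""
    return "password=" not in diff.lower()


def all_reviewers_pass(
    diff: str,
    reviewers: Sequence[Callable[[str], bool]] = (code_review, security_review),
) -> bool:
    """Run every reviewer concurrently; ship only if all of them approve."""
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        results = list(pool.map(lambda review: review(diff), reviewers))
    return all(results)
```

Neither reviewer sees the other's result, so a pass from one cannot soften the other's verdict.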

Extension 2: Chained Workflows (Triage → Implement → Review)

Use dispatch-workflow to chain a full pipeline:

# orchestrator.md starts a three-stage pipeline that runs in order:
#   1. triage-issue.md    → labels, prioritizes, writes acceptance criteria
#   2. implement.md       → coding agent picks up the issue
#   3. review.md          → independent QAS gate on the PR

# Each stage's output (an issue or PR event) triggers the next stage
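One way to picture event-driven chaining is as a small state machine in which each stage reacts to a label and, when it finishes, applies the label that wakes up the next stage. The sketch below simulates that in memory. The label names (`needs-triage`, `ready-to-implement`, `needs-review`, `reviewed`) and the `run_pipeline` helper are assumptions made for illustration, not names defined by SAW or gh-aw.

```python
# label that triggers a stage -> (workflow that runs, label it applies when done)
STAGES = {
    "needs-triage": ("triage-issue.md", "ready-to-implement"),
    "ready-to-implement": ("implement.md", "needs-review"),
    "needs-review": ("review.md", "reviewed"),
}


def run_pipeline(start_label: str = "needs-triage") -> list[str]:
    """Follow label events from `start_label` until no stage listens for the label."""
    label = start_label
    ran: list[str] = []
    while label in STAGES:
        workflow, next_label = STAGES[label]
        ran.append(workflow)   # in a real setup this is where the workflow would run
        label = next_label     # the stage's output event triggers the next stage
    return ran
```

Because every transition is a label on the issue or PR, a human can stop the chain by removing a label or restart it from any stage by applying one.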

Extension 3: Assign-to-Agent

The orchestrator can assign issues directly to the Copilot coding agent:

# After research + planning, assign the issue:
safe-outputs:
  assign-issue:
    assignees: ["copilot"]  # GitHub's coding agent picks it up
    labels: ["agent-ready", "auto-implement"]

The coding agent sees the issue with full context (from the research phase) and acceptance criteria (from the planning phase), then opens a PR.

Role Collapsing

Eleven roles is a lot for a solo developer. The SAW framework explicitly supports role collapsing — combining multiple roles into fewer agents when independence isn't critical.

✅ Can Collapse

Researcher + Planner → one "prep" agent
Implementer + Documenter → code + docs together
Triage + Responder → one "intake" agent
Monitor + Responder → detect + notify
Orchestrator + Planner → plan + dispatch

🚫 Never Collapse

QAS + Implementer (reviews own work)
Security reviewer + Implementer (audits own code)
Reviewer + Implementer (no independence)
Shipper + Implementer (no gate before deploy)

The rule is simple: any role that exists to independently verify or gate another role's output cannot be collapsed with the role it checks. Everything else is fair game for solo work.
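The never-collapse list lends itself to a mechanical check. Here is a minimal sketch, assuming each agent is described by the set of role names it plays. `NEVER_COLLAPSE` and `collapse_violations` are hypothetical names, and "Security" stands for the security-review side of QAS as used in the list above.

```python
from itertools import combinations

# Pairs from the "Never Collapse" list; order within a pair does not matter.
NEVER_COLLAPSE = {
    frozenset({"QAS", "Implementer"}),
    frozenset({"Security", "Implementer"}),
    frozenset({"Reviewer", "Implementer"}),
    frozenset({"Shipper", "Implementer"}),
}


def collapse_violations(agent_roles: set[str]) -> list[tuple[str, str]]:
    """Return every forbidden pair of roles held by a single agent."""
    violations = []
    # Sorting keeps the output order stable and each pair alphabetical.
    for first, second in combinations(sorted(agent_roles), 2):
        if frozenset({first, second}) in NEVER_COLLAPSE:
            violations.append((first, second))
    return violations
```

An empty result means the combination is safe for one agent; anything else names the gate that would be lost.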

In practice, most solo developers run three active roles, one of which stays with the human:

Implementer (collapsed with Researcher + Planner + Documenter) — does the work
Reviewer/QAS (independent subagent) — validates the work
Shipper (you, the human) — approves and merges

Dark Factory: Persistent Agent Teams (Experimental)

⚠️ Experimental Pattern

Dark Factories are an emerging concept — persistent autonomous agent teams running on remote servers. This is the bleeding edge of multi-agent coordination. Understand the concept, but don't build one until the tooling matures.

A "Dark Factory" is a tmux-based setup where multiple agent sessions run persistently on a remote server, each in its own pane:

Pane 1: Monitor agent — watches for new issues and PRs
Pane 2: Implementer agent — picks up assigned work
Pane 3: Reviewer agent — reviews completed PRs
Pane 4: Orchestrator — coordinates handoffs between panes

They communicate through the repository itself (issues, PRs, comments) — the same mechanism humans use. This means every interaction is auditable and the system degrades gracefully (a human can step in at any point).

The key insight: the coordination protocol is the same whether agents are ephemeral (workflow-triggered) or persistent (tmux-based). Issues and PRs are the message bus. Labels and assignments are the routing mechanism.

Connection to Your Work

Here's where you are on the multi-agent spectrum:

Level	Pattern	You?
1. Single agent	One prompt, one output	✅ Most of your work
2. Agent + subagent	Implementer spawns independent reviewer	✅ Your `/comply` pattern
3. Orchestrated chain	Workflow dispatches multiple specialists	⬜ Next step
4. Persistent team	Dark Factory — always-on agent ensemble	⬜ Future (experimental)

Your next step is Level 3: add a second specialized reviewer (security) alongside your existing code reviewer, or set up a dispatch-workflow that chains triage → implement → review automatically when issues are labeled.

💡 Start small

Don't jump to full orchestration. Add one more specialized subagent to your existing /comply pattern: a security reviewer that checks for secrets, injection vectors, and auth issues. That single addition captures much of the value of multi-agent coordination for very little coordination overhead.

Check Your Understanding

Why must the reviewer be a subagent (separate context) rather than a follow-up prompt in the same session?

Subagents are faster because they run in parallel
The main agent's context window is usually full after implementation
A fresh context provides independence — the reviewer isn't biased by the implementation journey and sees only the output
Subagents have access to different tools than the main agent

Answer: the third option. Independence is the key reason. A subagent has fresh context, with no memory of the exploratory dead-ends, compromises, or justifications that led to the implementation. It sees only the output and evaluates it on its own merits, which lets it catch issues the implementer rationalized away. This is the same reason human code review works: the reviewer reads the PR without having watched it being written, and wasn't there when shortcuts were taken.

What does OrchestratorOps do in the gh-aw multi-agent model?

It implements features by writing code across multiple files
It monitors all running workflows and restarts failed ones
It receives triggers and dispatches specialized worker workflows, coordinating their execution without doing the work itself
It merges PRs automatically once all checks pass

Answer: the third option. OrchestratorOps is purely a coordinator, not a worker. It receives a trigger (a new issue, a PR event, a schedule), decides which specialized workflow should handle the task, and dispatches it via dispatch-workflow with the right context. It is the routing layer between events and specialized agents; think of it as a dispatcher routing calls to the right department.

What does "role collapsing" mean, and which roles can never be collapsed together?

Removing unused roles from the framework; any role can be removed if not needed
Combining multiple roles into fewer agents for efficiency; QAS/Reviewer and Implementer can never be collapsed because independence is required for verification
Reducing the number of workflow files by putting multiple prompts in one file
Deprecating old roles as the framework evolves; Orchestrator and Monitor are being collapsed in vNext

Answer: the second option. Role collapsing means combining multiple SAW roles into fewer agents when full separation isn't needed, which lets solo developers run lean: Researcher + Planner, Implementer + Documenter, or even one agent doing research + planning + implementation. The hard rule: any role that exists to verify another role's output (QAS, Reviewer, Security reviewer) can never be collapsed with the role it checks. You can't review your own work, so independence is non-negotiable for quality gates.

Key Takeaways

Multi-agent isn't exotic — spawning a reviewer subagent (which you already do) is multi-agent coordination
OrchestratorOps dispatches, doesn't implement — it routes triggers to specialized workers
Research → Plan → Assign is the standard orchestration chain for complex work
Role collapsing keeps things lean for solo work, but QAS/Security independence is sacred
The repository is the message bus — issues, PRs, and labels coordinate agents the same way they coordinate humans

Primary Source

Safe Agentic Workflow — vNext Workflow Contract — the full specification of agent roles, exit states, handoff patterns, and role collapsing rules.

Questions? Ask me about role boundaries, when to add another subagent, how to set up dispatch-workflow chaining, or how the Dark Factory concept applies to your projects.
