Cross-Source Theme Analysis

The wiki now spans 42 sources, including 12 tools, 4 OSS frameworks, 4 benchmarks, 2 standards, 3 methodologies, 4 memory systems, and sources covering cost, governance, UX, and industry impact. Each theme below appears independently in three or more sources.

Refresh history: Originally written against 11 sources (Apr 9). Refreshed Apr 15 against 33 sources. Refreshed May 9 against 42 sources — added Gas Town, Symphony, Multica evidence to existing themes.


Theme 1: Context Is King (15/42 sources) ⭐⭐⭐⭐⭐

The single most repeated idea. Now backed by quantitative evidence from memory research.

| Source | How it appears |
| --- | --- |
| agent-skills-standard | Progressive disclosure: ~100 tokens at startup, full content only when activated |
| claude-code | MCP tools deferred, skills load on demand, subagent context isolation, CLAUDE.md under 200 lines |
| ten-pillars-agentic-skill-design | Pillar 9: four context management recipes |
| pai | TELOS (10 files), three-tier memory, “context document” as core primitive |
| fabric | Per-pattern model mapping, composable strategies |
| ai-technique-podcast | “Context beats clever prompting.” |
| skills-pipeline-sleestk | Reference files loaded on demand, minimal structured context forward |
| llm-wiki-pattern | Index-first navigation: read catalog, drill into relevant pages only |
| scion | Each agent gets own container with own context. No shared pollution. |
| mem0-memory-management | Four memory layers. 93% token reduction with selective retrieval vs full-context. |
| continuum-memory-architectures | Six formal requirements for context persistence. CMA won 82/92 vs RAG. |
| efficient-memory-architectures | Four failure modes of flat vector storage. MemGPT: 90% token savings. |
| agent-cost-economics | Five waste vectors: 60-80% of tokens wasted on wrong context. |
| crewai-multi-agent | Role + backstory as persona-based context management |
| langgraph-agent-orchestration | Checkpointed state as persistent context across workflow steps |

Strengthened consensus: The advice is no longer just “load the right thing”; it is now quantified. Selective retrieval uses 93% fewer tokens for a roughly 5% accuracy tradeoff (mem0-memory-management). Context management is therefore both a quality strategy and a cost strategy (agent-cost-economics).
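A minimal sketch of the progressive-disclosure idea from agent-skills-standard: only short summaries sit in context at startup, and a skill's full content is loaded the first time it is activated. The `Skill` and `SkillRegistry` names and the loader callback are hypothetical, chosen for illustration; none of the sources is claimed to implement it this way.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Skill:
    """A hypothetical skill: a short summary plus a lazily loaded body."""
    name: str
    summary: str                      # small, always present in context
    load_body: Callable[[], str]      # invoked only on first activation
    _body: Optional[str] = field(default=None, repr=False)

    def body(self) -> str:
        # Load the full content once, then reuse the cached copy.
        if self._body is None:
            self._body = self.load_body()
        return self._body


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def startup_context(self) -> str:
        # Only names and summaries go into the initial context.
        return "\n".join(f"{s.name}: {s.summary}" for s in self._skills.values())

    def activate(self, name: str) -> str:
        # The full content enters the context only when the skill is used.
        return self._skills[name].body()
```

The design choice is that the cost of a skill the agent never uses stays at the size of its summary line, which is the property the sources above describe.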


Theme 2: Composition Over Monoliths (17/42 sources) ⭐⭐⭐⭐⭐

Validated across every framework — product-level and open-source alike.

| Source | How it appears |
| --- | --- |
| fabric | 251 focused patterns. Unix philosophy: pipe and compose. |
| skills-pipeline-sleestk | 6-stage YouTube pipeline. Each skill is one domain. |
| claude-code | Subagents (Explore, Plan, General-purpose). Skills per task. |
| scion | Harness per tool. Template per agent. Grove per project. |
| agent-skills-standard | One skill per directory. Under 500 lines. |
| ten-pillars-agentic-skill-design | Pillar 3 (SRP), Pillar 4 (modularity). |
| pai | 63 skills, 21 hooks, 14 agents, 12 standalone packs. |
| kiro | Powers as modular packages. Sub-agents for coordination. |
| autogen-multi-agent | Specialized agents communicate through dialogue |
| crewai-multi-agent | Crew of role-specialized agents with task dependencies |
| langgraph-agent-orchestration | Nodes as composable units in a graph |
| openai-swarm | Agents as system prompts + functions. Minimal units. |
| paperclip | Agents organized into companies with specialized roles |
| spec-kit | 30+ agents, 50+ extensions, each with focused scope |
| gastown | Seven specialized roles (including Mayor, Polecats, Refinery, Witness, Deacon, Dogs). Molecules as composable workflow units. |
| symphony | WORKFLOW.md per repo. Each issue gets isolated workspace + agent session. Separation of scheduler from agent. |
| multica | Reusable skills that compound. Each agent is a focused teammate with a specific runtime. |

Strongest convergence in the wiki. Every single multi-agent framework chose small, focused, composable units. No exceptions.


Theme 3: The Human Stays in the Loop, But How Much? (13/42 sources) ⭐⭐⭐⭐⭐

Now formalized with measurable UX patterns and a phased adoption framework.

| Source | Position |
| --- | --- |
| scion | “Interaction is imperative.” |
| kiro | Frontier agents: hours/days of autonomy. PR-only output. |
| claude-code | 6 permission modes: a configurable dial. |
| pai | Self-modifying, but human sets goals (TELOS). |
| evaluating-agent-skills-caparas | Human review is Tier 3: expensive, use sparingly. |
| ai-technique-podcast | “AI as thinking partner, not executor only.” |
| llm-wiki-pattern | Human curates sources. LLM does everything else. |
| agentic-ux-patterns | Six UX patterns with metrics: Intent Preview (>85% acceptance), Autonomy Dial (4 levels), Confidence Signal, Audit & Undo (<5% reversion), Escalation (>90% recovery). |
| agentic-ai-governance | Kill switches, dynamic least privilege, continuous observability. |
| autogen-multi-agent | Configurable human participation in agent conversations. |
| crewai-multi-agent | Delegation: agents can ask humans or other agents for help. |
| langgraph-agent-orchestration | Human-in-the-loop at any node (first-class). |
| agentic-ai-non-code-domains | Healthcare demands human oversight; finance requires compliance gates. |

New: The agentic-ux-patterns source formalizes this spectrum into six measurable patterns. The Autonomy Dial (Observe → Propose → Confirm → Autonomous) maps closely onto Claude Code’s permission modes. Phased adoption: safety first → calibrated autonomy → proactive delegation.
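The four dial levels can be sketched as an ordered enum that gates what an agent may do with a proposed action. The level names come from the text above; the behavior attached to each level, and the `decide` function itself, are assumptions made for illustration rather than anything specified by agentic-ux-patterns or Claude Code.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """The four dial levels, ordered from least to most autonomous."""
    OBSERVE = 0
    PROPOSE = 1
    CONFIRM = 2
    AUTONOMOUS = 3


def decide(level: Autonomy, action: str, approved: bool = False) -> str:
    """Return what the agent may do with `action` at the given dial level.

    The per-level semantics here are assumed, not taken from a source.
    """
    if level is Autonomy.OBSERVE:
        return f"log only: {action}"
    if level is Autonomy.PROPOSE:
        return f"suggest to human: {action}"
    if level is Autonomy.CONFIRM:
        # Execution is allowed only after explicit human approval.
        return f"execute: {action}" if approved else f"await approval: {action}"
    return f"execute: {action}"
```

Using an ordered enum means a phased rollout can be expressed as raising a single setting over time, for example starting every deployment at `Autonomy.OBSERVE`.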


Theme 4: Skills Are Evolving Into a Standard (6/42 sources) ⭐⭐⭐⭐

Unchanged from original analysis. The evolution trajectory remains clear.

Fabric Patterns (2023) → Agent Skills Standard (2025) → Claude Code Skills (2026) → Pipelines + Evaluation. From simple prompt files to a full lifecycle.


Theme 5: Memory Is No Longer the Unsolved Frontier (10/42 sources) ⭐⭐⭐⭐⭐

Upgraded from “unsolved” to “understood with clear tradeoffs.” Four new memory sources provide formal requirements, benchmarks, and architecture patterns.

| Source | Memory approach |
| --- | --- |
| pai | Three-tier hot/warm/cold. Self-modification. Most sophisticated product. |
| claude-code | CLAUDE.md + auto memory. Per working tree. |
| kiro | Persistent context. Learns from code reviews. |
| llm-wiki-pattern | The wiki IS the memory. File + Database pattern. |
| scion | No memory. Each agent starts fresh. |
| mem0-memory-management | Graph+vector, four layers, five scopes. LOCOMO benchmarks. |
| continuum-memory-architectures | Six formal CMA requirements. 82/92 wins vs RAG. |
| agent-memory-systems-2026 | Four patterns: vector-only, graph+vector, file+DB, hierarchical. |
| efficient-memory-architectures | H-MEM, MemGPT (90% savings), GraphRAG, selective forgetting. |
| crewai-multi-agent | Built-in short/long/entity memory across agents. |

Key advances: CMA defines six necessary conditions (RAG meets none). Mem0 provides production benchmarks (93% token reduction). Forgetting is now recognized as a design requirement, not a failure. See memory-architecture-comparison for the full analysis.


Theme 6: Git as Universal Substrate (9/42 sources) ⭐⭐⭐⭐⭐

Upgraded from ⭐⭐⭐⭐ to ⭐⭐⭐⭐⭐. Gas Town and Symphony provide the strongest evidence yet that git is the coordination primitive for multi-agent systems.

| Source | How it appears |
| --- | --- |
| scion | Git worktrees per agent |
| kiro | Git branches, PR output |
| claude-code | Git-based workspaces |
| gastown | Git worktrees for every polecat. Dolt (git-for-data) for beads. Merge queue (Refinery). Wasteland federation via DoltHub. Most git-native tool in the wiki. |
| symphony | Per-issue workspace directories. Workspaces persist across runs. WORKFLOW.md version-controlled with the codebase. |
| multica | Git-based workspace isolation per agent task |
| paperclip | Agent-agnostic but assumes git-based code output |
| spec-kit | Spec-driven development with git-versioned artifacts |
| llm-wiki-pattern | The wiki itself is git-backed |

Key advance (May 2026): Gas Town takes git further than any other tool in the wiki, combining worktrees for isolation, Dolt for cell-level merge of concurrent agent writes, and a Bors-style merge queue for quality gates. Symphony uses git implicitly (workspaces are filesystem directories that can be git repos via hooks). The pattern is universal.
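The worktree-per-agent pattern can be sketched with the standard `git worktree add` command, which checks out a new branch into a separate directory that shares the same repository. The helper below only builds the command; the branch naming scheme and directory layout are hypothetical and are not taken from Gas Town, Scion, or any other source.

```python
from pathlib import Path
from typing import List


def worktree_command(repo: Path, agent_id: str, base: str = "main") -> List[str]:
    """Build the git command that gives one agent its own worktree and branch."""
    branch = f"agent/{agent_id}"                      # hypothetical branch naming
    path = repo.parent / f"{repo.name}-{agent_id}"    # hypothetical directory layout
    # `git worktree add -b <branch> <path> <commit-ish>` creates the branch from
    # `base` and checks it out in a new directory next to the main checkout.
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Because each agent writes to its own directory and branch, concurrent agents do not overwrite each other's working files, and integration happens later through ordinary merges.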


Theme 7: Evaluation Has a Framework Now (8/42 sources) ⭐⭐⭐⭐

Upgraded from “weakest link” to “framework exists, adoption lags.”

| Source | Contribution |
| --- | --- |
| evaluating-agent-skills-caparas | Three-tier framework (deterministic → LLM-judge → human) |
| ten-pillars-agentic-skill-design | Pillar 7. Acknowledged “no controlled study.” |
| anthropic-eval-guide | Success criteria, eval types, design principles |
| promptfoo | Open-source eval CLI, YAML test cases, CI/CD |
| humaneval-benchmark | Code generation: 164 problems, pass@k, 0% → 96.3% |
| swe-bench | Real-world SE: 2,294 GitHub issues, top 74.4% resolved |
| gaia-benchmark | General AI: 466 questions, humans 92% vs AI <50% |
| agentbench | Agent decision-making: 8 environments, multi-turn |

Key advance: The benchmark landscape now covers code generation (solved at 96%), real-world SE (rapidly improving at 74%), general reasoning (far from human at <50%), and interactive agents (commercial models far ahead of open-source). See agent-benchmarks for the full comparison. Still missing: skill-level eval, multi-agent coordination quality, memory quality benchmarks.
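For readers unfamiliar with the pass@k metric mentioned for humaneval-benchmark, the sketch below shows the unbiased estimator commonly used with HumanEval: for one problem, draw n samples, count the c that pass the tests, and estimate the chance that at least one of k randomly chosen samples passes. The function name and argument checks are illustrative choices; a benchmark score is the mean of this value over all problems.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c correct, budget k."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random size-k
    # subset of the n samples contains no correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With k = 1 the estimate reduces to c / n, the plain fraction of samples that pass.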


Theme 8: Open Standards Are Winning (5/42 sources) ⭐⭐⭐

Unchanged. MCP + Agent Skills as two-layer open substrate.


NEW Theme 9: Graphs Are Becoming the Consensus Orchestration Architecture (4/42 sources) ⭐⭐⭐

| Source | Evidence |
| --- | --- |
| langgraph-agent-orchestration | Built on graphs from day one. Most production-ready OSS. |
| autogen-multi-agent | Transitioning from GroupChat to graph-based MAF. |
| scion | Directed workflows for agent coordination. |
| kiro | Sub-agents coordinated through structured task graphs. |

Both AutoGen (via MAF) and LangGraph are converging on typed nodes + edges. The conversation-based approach (AutoGen v0.2 GroupChat) is being abandoned by its own creators. See multi-agent-framework-guide.

May 2026 nuance: gastown proves graphs aren’t the only path to production scale. Gas Town’s process-model (deterministic routing via external state, GUPP pull-based execution) scales to 20-30 agents without graphs. The graph convergence applies to framework-level orchestration; workspace-level orchestration may use different primitives entirely.
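To make the "nodes + edges" idea concrete, here is a minimal sketch of graph-style orchestration: nodes are functions that update a shared state, and each node has a router that picks the next node or stops. This is a generic illustration with hypothetical names (`run_graph`, `Node`, `Router`); it is not the LangGraph or MAF API.

```python
from typing import Callable, Dict, Optional

State = Dict[str, object]
Node = Callable[[State], State]
Router = Callable[[State], Optional[str]]  # next node name, or None to stop


def run_graph(nodes: Dict[str, Node], edges: Dict[str, Router],
              start: str, state: State, max_steps: int = 50) -> State:
    """Walk the graph from `start`, letting each node update the state."""
    current = start
    for _ in range(max_steps):
        state = nodes[current](state)
        nxt = edges[current](state)
        if nxt is None:
            return state
        current = nxt
    # Guard against routers that never terminate.
    raise RuntimeError("max_steps exceeded; possible cycle")


# Usage: a draft/review loop that repeats until the review passes.
def draft(s: State) -> State:
    return {**s, "draft": int(s.get("draft", 0)) + 1}


def review(s: State) -> State:
    return {**s, "ok": int(s["draft"]) >= 2}


result = run_graph(
    nodes={"draft": draft, "review": review},
    edges={"draft": lambda s: "review",
           "review": lambda s: None if s["ok"] else "draft"},
    start="draft",
    state={},
)
```

Because routing is explicit data rather than free-form conversation, the control flow can be inspected, bounded, and checkpointed, which is the property the graph-based frameworks above emphasize.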


NEW Theme 10: Token Economics Drive Architecture (5/42 sources) ⭐⭐⭐

| Source | Evidence |
| --- | --- |
| agent-cost-economics | 60-80% of tokens wasted. Five waste vectors. $5T infrastructure bet. |
| mem0-memory-management | 93% token reduction with selective retrieval. |
| efficient-memory-architectures | MemGPT: 90% token savings via OS-style paging. |
| context-management | Progressive disclosure, scoped instructions, deferred tools. |
| agent-memory-systems-2026 | Cost comparison across four memory patterns. |

Cost optimization is not a billing concern — it’s an architectural concern. Every memory, context, and orchestration decision has a direct token cost implication. See cost-optimization-guide.
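A small worked example shows why a context decision is also a cost decision. The 93% reduction is the figure quoted from mem0-memory-management; the context size, call volume, and token price are made-up illustrative values, not measurements or real pricing.

```python
def monthly_cost(tokens_per_call: int, calls_per_month: int,
                 usd_per_million_tokens: float) -> float:
    """Input-token spend per month, in USD."""
    return tokens_per_call * calls_per_month * usd_per_million_tokens / 1_000_000


# Assumed, illustrative inputs (not from any source).
FULL_CONTEXT_TOKENS = 20_000
CALLS_PER_MONTH = 100_000
USD_PER_MILLION = 3.0

# 93% fewer tokens with selective retrieval (figure from mem0-memory-management).
selective_tokens = round(FULL_CONTEXT_TOKENS * (1 - 0.93))

full = monthly_cost(FULL_CONTEXT_TOKENS, CALLS_PER_MONTH, USD_PER_MILLION)
selective = monthly_cost(selective_tokens, CALLS_PER_MONTH, USD_PER_MILLION)

print(f"full context: ${full:,.0f}/month, selective: ${selective:,.0f}/month")
```

Under these assumed inputs the full-context design costs $6,000 per month and the selective design $420, so the retrieval architecture, not the billing plan, determines the spend.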


NEW Theme 11: Governance Is the Next Frontier (4/42 sources) ⭐⭐⭐

| Source | Evidence |
| --- | --- |
| agentic-ai-governance | Five pillars. Shadow AI ($412K/yr). NIST AI Agent Standards Initiative. |
| agentic-ux-patterns | Six UX patterns as user-facing governance layer. |
| agentic-ai-non-code-domains | Regulated industries (healthcare, finance) demand governance for deployment. |
| agent-cost-economics | 68% of employees use AI without IT approval. |

Legacy security models fail for agents (speed, identity, permissions all different). Regulatory landscape crystallizing: NIST, EU AI Act, OWASP, Singapore framework. See governance-safety-overview.


NEW Theme 12: Agentic AI Is Expanding Beyond Code (3/42 sources) ⭐⭐⭐

| Source | Evidence |
| --- | --- |
| agentic-ai-non-code-domains | Six industries: finance (40-60% compliance reduction), healthcare, legal, manufacturing, telecoms, transport. |
| agent-cost-economics | Enterprise ARPU $450-500/mo. SaaS disruption. $24.2B raised in 2025. |
| llm-wiki-pattern | Applies to research, reading, business: anywhere knowledge accumulates. |

The wiki’s themes generalize across all industries. Main difference: non-code domains have higher stakes (healthcare hallucinations, legal liability). See beyond-code-industry-impact.


Theme Matrix (Updated May 2026)

| Theme | Sources | Strength | Change |
| --- | --- | --- | --- |
| Context is king | 15/42 | ⭐⭐⭐⭐⭐ | → Quantified (93% token reduction) |
| Composition over monoliths | 17/42 | ⭐⭐⭐⭐⭐ | ↑ Gas Town, Symphony, Multica all validate |
| Human in the loop (spectrum) | 13/42 | ⭐⭐⭐⭐⭐ | → Formalized as 6 UX patterns |
| Memory architectures | 10/42 | ⭐⭐⭐⭐⭐ | → “Understood” |
| Git as universal substrate | 9/42 | ⭐⭐⭐⭐⭐ | ↑↑ Gas Town is the most git-native tool in the wiki |
| Evaluation frameworks | 8/42 | ⭐⭐⭐⭐ | → “Framework exists” |
| Skills evolving into standard | 6/42 | ⭐⭐⭐⭐ | → Unchanged |
| Open standards winning | 5/42 | ⭐⭐⭐ | → Unchanged |
| Token economics drive architecture | 5/42 | ⭐⭐⭐ | |
| Graph orchestration convergence | 4/42 | ⭐⭐⭐ | ~ Gas Town proves alternative path |
| Governance is next frontier | 4/42 | ⭐⭐⭐ | |
| Expanding beyond code | 3/42 | ⭐⭐⭐ | |

See Also