Key Insights: The Agentic AI Landscape (May 2026)

Synthesized from 42 sources across this wiki. This analysis captures the patterns, tensions, and emerging consensus visible when you look across the entire landscape.

Refresh history: Originally written against 16 sources (Apr 8-10). Refreshed Apr 15 against 33 sources. Refreshed May 9 against 42 sources — added Gas Town, Symphony, Multica; updated layers, multi-agent philosophies, and “What’s Still Missing.”


1. Seven Layers Have Emerged (was Six)

| Layer | Representatives | Core Bet |
| --- | --- | --- |
| Company | paperclip | Org charts, budgets, governance, goal alignment |
| Methodology | spec-kit, bmad-method | Specs, plans, tasks, quality gates |
| Orchestration | gastown, symphony, multica | Workspace coordination, issue-to-agent automation, team collaboration |
| Infrastructure | scion, langgraph-agent-orchestration, autogen-multi-agent | Containers, runtimes, graph orchestration |
| Tool | claude-code, kiro, crewai-multi-agent | Agentic loop, skills, hooks, MCP |
| Pattern | fabric, agent-skills-standard, openai-swarm | Curated prompts, composable strategies, handoffs |
| Memory | mem0, agent-memory-persistence | Persistence, retrieval, forgetting, knowledge graphs |

New (May 2026): Orchestration is now its own layer between Methodology and Infrastructure. Gas Town (workspace/process-model), Symphony (spec/scheduler), and Multica (platform/teammates) represent three distinct approaches at this layer. The emerging stack: Paperclip → Multica → Gas Town → LangGraph/MAF → Claude Code/Kiro → Scion → Mem0.

2. The Autonomy–Interaction Spectrum Is Formalized

The spectrum is no longer just a concept; agentic-ux-patterns gives it measurable UX patterns:

← More Human Control                              More Agent Autonomy →

Observe &     Plan &       Act with        Act
Suggest       Propose      Confirmation    Autonomously
(Scion)       (CrewAI)     (Claude Code)   (Kiro, PAI)

The Autonomy Dial provides four levels with metrics: >85% acceptance rate, <5% reversion rate, >90% escalation recovery. Phased adoption: safety first → calibrated autonomy → proactive delegation.
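One way to read these thresholds is as a promotion gate: an agent moves one notch up the dial only when all three metrics clear their bars. The sketch below illustrates that reading. The level names follow the spectrum above and the thresholds are the ones quoted, but `DialMetrics`, `next_level`, and the one-step-at-a-time promotion rule are hypothetical and not taken from agentic-ux-patterns.

```python
from dataclasses import dataclass

# The four levels of the spectrum, ordered from most human control to most autonomy.
LEVELS = [
    "observe_and_suggest",
    "plan_and_propose",
    "act_with_confirmation",
    "act_autonomously",
]


@dataclass
class DialMetrics:
    acceptance_rate: float      # share of agent proposals the user accepted
    reversion_rate: float       # share of agent actions later undone
    escalation_recovery: float  # share of escalations that were resolved


def meets_thresholds(m: DialMetrics) -> bool:
    """Thresholds quoted in the text: >85%, <5%, >90%."""
    return (
        m.acceptance_rate > 0.85
        and m.reversion_rate < 0.05
        and m.escalation_recovery > 0.90
    )


def next_level(current: str, m: DialMetrics) -> str:
    """Assumed policy: move up one level at a time, and only if all metrics pass."""
    i = LEVELS.index(current)
    if meets_thresholds(m) and i < len(LEVELS) - 1:
        return LEVELS[i + 1]
    return current
```

Under this assumed policy, an agent at `plan_and_propose` with a 90% acceptance rate, 2% reversion rate, and 95% escalation recovery would be promoted to `act_with_confirmation`; the same agent with an 8% reversion rate would stay where it is.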

3. Two Open Standards + Graph Convergence

| Standard | What It Does | Adoption |
| --- | --- | --- |
| mcp-protocol | Connects agents to external tools/data | Claude Code, Kiro, Fabric |
| agent-skills-standard | Packages reusable agent capabilities | Claude Code, agentskills.io |

New: Graph-based workflows are converging as the third standard — not a formal spec, but a consensus architecture. Both AutoGen (MAF) and LangGraph use typed nodes + edges. The conversation-based approach (GroupChat) is being abandoned by its own creators.
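To make "typed nodes + edges" concrete, here is a minimal sketch of the idea in plain Python. It deliberately does not use the LangGraph or AutoGen (MAF) APIs; the `State` shape, the `write` and `review` nodes, and the `run` loop are all hypothetical and exist only to show the architecture: nodes transform a typed state, and edges decide which node runs next.

```python
from typing import Callable, Dict, Optional, TypedDict


class State(TypedDict):
    task: str
    draft: str
    approved: bool


Node = Callable[[State], State]
Router = Callable[[State], Optional[str]]  # returns the next node name, or None to stop


def write(state: State) -> State:
    # Stand-in for a writer agent.
    return {**state, "draft": f"draft for {state['task']}"}


def review(state: State) -> State:
    # Stand-in for a reviewer agent: approve any non-empty draft.
    return {**state, "approved": bool(state["draft"])}


NODES: Dict[str, Node] = {"write": write, "review": review}
EDGES: Dict[str, Router] = {
    "write": lambda s: "review",
    "review": lambda s: None if s["approved"] else "write",
}


def run(start: str, state: State, max_steps: int = 10) -> State:
    """Walk the graph from `start` until an edge returns None or max_steps is hit."""
    current: Optional[str] = start
    for _ in range(max_steps):
        if current is None:
            break
        state = NODES[current](state)
        current = EDGES[current](state)
    return state
```

The control flow lives in `EDGES` rather than in a conversation transcript, which is the property that distinguishes this style from a GroupChat-style loop.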

4. Progressive Disclosure Is Quantified

The consensus solution to context management now has benchmarks:

| Approach | Tokens | Accuracy | Source |
| --- | --- | --- | --- |
| Full-context (send everything) | ~26,000 | 72.9% | mem0-memory-management |
| Selective retrieval (Mem0) | ~1,800 | 66.9% | mem0-memory-management |
| Graph-enhanced (Mem0g) | ~1,800 | 68.4% | mem0-memory-management |
| MemGPT paging | ~1,000 | | efficient-memory-architectures |

Selective retrieval uses about 93% fewer tokens (~1,800 vs ~26,000) at a cost of 4.5 to 6 percentage points of accuracy. For any interactive agent, selective retrieval is the production-viable path.
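The headline tradeoff can be checked directly from the benchmark figures in the table. The snippet below is just that arithmetic; the token counts are the approximate values quoted above, so the results are approximate too.

```python
# Figures copied from the benchmark table (token counts are approximate).
FULL_CONTEXT_TOKENS = 26_000
SELECTIVE_TOKENS = 1_800
accuracy = {"full_context": 72.9, "mem0": 66.9, "mem0g": 68.4}

# Fraction of tokens saved by selective retrieval.
token_reduction = 1 - SELECTIVE_TOKENS / FULL_CONTEXT_TOKENS

# Accuracy given up, in percentage points, relative to full-context.
drop_mem0 = accuracy["full_context"] - accuracy["mem0"]
drop_mem0g = accuracy["full_context"] - accuracy["mem0g"]

print(f"token reduction:       {token_reduction:.1%}")
print(f"accuracy drop (Mem0):  {drop_mem0:.1f} points")
print(f"accuracy drop (Mem0g): {drop_mem0g:.1f} points")
```

The token reduction works out to roughly 93%, and the accuracy cost is 6.0 points for Mem0 and 4.5 points for Mem0g.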

5. Memory Is Understood (Was “Unsolved Frontier”)

Four architecture patterns documented (memory-architecture-comparison):

  • Vector-Only: fast, no relationships (1/6 CMA compliance)
  • Graph+Vector (mem0): relationships + semantic search (4/6 CMA)
  • File+Database (llm-wiki-pattern): human-readable, git-friendly (2/6 CMA)
  • Hierarchical (MemGPT, pai): mimics human cognition (5/6 CMA)

Six formal CMA requirements defined (continuum-memory-architectures). Standard RAG meets none. Forgetting is a design requirement, not a failure. Progression: start vector-only → add graph → add hierarchy → add forgetting.
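As an illustration of forgetting as a deliberate design feature, the sketch below decays each memory's importance over time and prunes anything that falls below a threshold. This is one simple policy chosen for illustration, not the mechanism used by mem0, MemGPT, or the CMA requirements; `MemoryItem`, `retention`, `forget`, the half-life, and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryItem:
    text: str
    importance: float   # 0..1, assigned at write time (hypothetical scale)
    last_access: float  # timestamp in seconds


def retention(item: MemoryItem, now: float, half_life_s: float) -> float:
    """Importance decayed exponentially by the time since last access."""
    age = max(0.0, now - item.last_access)
    return item.importance * 0.5 ** (age / half_life_s)


def forget(
    items: List[MemoryItem],
    now: float,
    half_life_s: float = 86_400.0,  # assumed one-day half-life
    threshold: float = 0.1,         # assumed pruning threshold
) -> List[MemoryItem]:
    """Keep only items whose decayed score is still at or above the threshold."""
    return [it for it in items if retention(it, now, half_life_s) >= threshold]
```

With these assumed defaults, an item written with importance 0.8 survives a full day untouched (its score halves to 0.4), while an item with importance 0.15 is pruned after the same day (0.075 falls below 0.1).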

6. Git Remains the Universal Coordination Mechanism

Unchanged. Every tool uses git. No one is building a custom protocol. But git only works for text-shaped artifacts.

7. The Skill Hierarchy Is Crystallizing

Unchanged. Fabric Patterns → Agent Skills Standard → Claude Code Skills → PAI Skills. PAI’s insight: CODE → CLI → PROMPT → SKILL (code before prompts).

8. Security Models Now Have a Governance Layer

Expanded from tool-level security to organizational governance (agentic-ai-governance):

| Layer | Approach | Source |
| --- | --- | --- |
| Infrastructure | Container isolation | scion |
| Agent | Permission modes + classifier | claude-code |
| Output | Sandbox + PR-only | kiro |
| Policy | Hooks + allowlists | pai |
| Organization | Five pillars: inventory, identity, least privilege, observability, compliance | agentic-ai-governance |
| User-facing | Six UX patterns: intent preview, autonomy dial, rationale, confidence, audit, escalation | agentic-ux-patterns |

Shadow AI: 68% of employees use AI without IT approval. $412K/yr average cost. Regulatory landscape crystallizing (NIST, EU AI Act, OWASP, Singapore).

9. The “Personal AI” Vision Extends to Every Industry

Expanded from coding to six industries (agentic-ai-non-code-domains):

| Industry | Impact | Key Tension |
| --- | --- | --- |
| Financial Services | 40-60% compliance time reduction | Regulatory compliance |
| Healthcare | Enormous potential | Hallucination = life-threatening |
| Professional Services | Existential SaaS disruption | Hourly-billing model at risk |
| Manufacturing | Adaptive vs rigid automation | Physical-digital convergence |
| Telecoms | 8-15% cost reductions | Underutilized data |
| Transportation | 12-20% delivery improvements | Clear quantifiable ROI |

The wiki’s themes generalize across all industries. Stakes are higher outside code. Domain expertise is the moat.

Evaluation has been upgraded with four standardized benchmarks (agent-benchmarks):

| Benchmark | What It Tests | Top Score | Human Baseline |
| --- | --- | --- | --- |
| humaneval-benchmark | Code generation | 96.3% | |
| swe-bench | Real-world SE | 74.4% | |
| gaia-benchmark | General AI assistant | <50% | 92% |
| agentbench | Interactive agents (8 envs) | Commercial >> OSS | |

Progression: code gen (solved) → real-world SE (improving fast) → general reasoning (far from human) → interactive agents (commercial leads). Still missing: skill-level eval, multi-agent quality, memory quality benchmarks.

11. Token Economics Are an Architectural Concern (NEW)

60-80% of agent tokens are waste (agent-cost-economics). Five waste vectors: file reading loops, retry tax, over-qualified models, no caching, context contamination.

$5T infrastructure bet with base case 3.2% ROI. Per-token costs falling 85% but total cost flat/increasing due to volume. Reasoning models use 8× more tokens. The industry’s viability depends on enterprise adoption at $450-500/mo ARPU.
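The "cheaper tokens, flat or rising bill" effect follows from simple arithmetic on the two figures above. The snippet below works it through under one simplifying assumption that is not from the source: the same workload moves to a reasoning model and nothing else changes.

```python
PRICE_DROP = 0.85         # per-token cost decline quoted above
REASONING_TOKEN_MULT = 8  # reasoning models use about 8x more tokens (quoted above)

# New per-token price as a fraction of the old price.
price_factor = 1 - PRICE_DROP

# Token-volume growth at which total spend is unchanged.
break_even_volume = 1 / price_factor

# Total spend relative to before, if the same workload moves to a reasoning model.
spend_factor = price_factor * REASONING_TOKEN_MULT

print(f"price factor:        {price_factor:.2f}x")
print(f"break-even volume:   {break_even_volume:.2f}x")
print(f"spend with 8x tokens: {spend_factor:.2f}x")
```

An 85% price cut is fully offset once token volume grows about 6.7×, and an 8× increase in tokens per task leaves total spend roughly 20% higher than before the price cut.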

Optimization is architecture: model routing (5-8× savings), prompt caching (90% discount), session discipline, selective retrieval (93% reduction). See cost-optimization-guide.
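Model routing is the most mechanical of these levers, so a small sketch may help. The prices below are made-up relative units, not vendor pricing, and `Task`, `route`, and `cost` are hypothetical names; the `needs_reasoning` flag is assumed to come from some upstream classifier that the sketch does not implement.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical relative prices per 1K tokens; not real vendor pricing.
PRICE_PER_1K = {"small": 1.0, "large": 10.0}


@dataclass
class Task:
    prompt: str
    needs_reasoning: bool  # assumed to be set by an upstream classifier
    tokens: int


def route(task: Task) -> str:
    """Send only tasks flagged as needing reasoning to the large model."""
    return "large" if task.needs_reasoning else "small"


def cost(tasks: List[Task], chooser: Callable[[Task], str]) -> float:
    """Total cost of a batch when `chooser` picks the model for each task."""
    return sum(PRICE_PER_1K[chooser(t)] * t.tokens / 1000 for t in tasks)


# Example batch: nine simple tasks and one hard one, 1K tokens each.
tasks = [Task(f"simple {i}", False, 1000) for i in range(9)]
tasks.append(Task("hard", True, 1000))

routed = cost(tasks, route)
always_large = cost(tasks, lambda t: "large")
print(f"routed: {routed}, always large: {always_large}, "
      f"savings: {always_large / routed:.1f}x")
```

The savings ratio here depends entirely on the assumed price gap and task mix, so it should not be read as confirming the 5-8× figure; it only shows how the ratio is computed.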

12. Multi-Agent Has Seven Philosophies Now (was Four) (NEW → EXPANDED)

Frameworks (composable, bring-your-own-model):

| Framework | Philosophy | Best For |
| --- | --- | --- |
| autogen-multi-agent | Conversation | Research, prototyping |
| crewai-multi-agent | Role-based teams | Complex research |
| langgraph-agent-orchestration | State machine graphs | Production workflows |
| openai-swarm | Minimal handoffs | Simple routing, learning |

Orchestration tools (workspace/platform-level):

| Tool | Philosophy | Best For |
| --- | --- | --- |
| gastown | Process-model (GUPP) | 20-30 parallel agents, crash-surviving state |
| symphony | Spec/protocol (WORKFLOW.md) | Issue-tracker-driven automation, minimal infra |
| multica | Platform (agents as teammates) | Team collaboration, compounding skills |

Key architectural split: Conversation-as-control (3-5 agents) vs process-model (20-30 agents). Gas Town is the only tool using deterministic routing via external state rather than LLM conversation for coordination. See orchestration-tools-compared and multi-agent-framework-guide.
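The difference between the two styles is easier to see in code. The sketch below shows deterministic routing via external state in its simplest form: work items live in a JSON file, and a coordinator hands the first pending item to whichever agent asks. It is not Gas Town's implementation; the file layout, `claim_next`, and the single-coordinator assumption (no file locking) are all hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def load(path: Path) -> dict:
    """Read the shared state file, or start empty if it does not exist yet."""
    return json.loads(path.read_text()) if path.exists() else {"items": []}


def save(path: Path, state: dict) -> None:
    # Write to a temp file and rename, so a crash never leaves a half-written state file.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    tmp.replace(path)


def claim_next(path: Path, agent_id: str) -> Optional[str]:
    """Hand the first pending item to agent_id. No LLM is involved in the decision."""
    state = load(path)
    for item in state["items"]:
        if item["status"] == "pending":
            item["status"] = "in_progress"
            item["owner"] = agent_id
            save(path, state)
            return item["id"]
    return None
```

Because assignment depends only on the persisted file, a restarted coordinator can pick up exactly where it left off, which is the crash-surviving property the table attributes to the process-model approach. A real system with many concurrent agents would also need locking or a transactional store, which this sketch omits.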

13. Governance Requires Five Pillars (NEW)

From agentic-ai-governance: Agent Inventory → Agent Identity (NHI) → Dynamic Least Privilege → Continuous Observability → Continuous Compliance. Legacy security fails because agents violate every assumption (identity, permissions, behavior, speed, audit trail). Kill switches are non-negotiable. See governance-safety-overview.
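A toy authorization gate shows how several of these pillars can meet at a single checkpoint. Everything here is hypothetical (`AgentRecord`, `Gate`, and the method names are invented for illustration), and the compliance pillar is not modeled at all.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AgentRecord:
    agent_id: str            # inventory entry and non-human identity
    allowed_tools: Set[str]  # least privilege: explicit allowlist
    killed: bool = False     # kill-switch flag


@dataclass
class Gate:
    agents: Dict[str, AgentRecord] = field(default_factory=dict)
    audit_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def register(self, rec: AgentRecord) -> None:
        """Inventory: only registered agents can ever be authorized."""
        self.agents[rec.agent_id] = rec

    def authorize(self, agent_id: str, tool: str) -> bool:
        """Allow a tool call only for a known, live agent with that tool allowlisted."""
        rec = self.agents.get(agent_id)
        ok = rec is not None and not rec.killed and tool in rec.allowed_tools
        # Observability: every decision is recorded, including denials.
        self.audit_log.append((agent_id, tool, ok))
        return ok

    def kill(self, agent_id: str) -> None:
        """Kill switch: the agent stays in the inventory but loses all access."""
        self.agents[agent_id].killed = True
```

In this sketch an unregistered agent is denied by default, an agent is denied any tool outside its allowlist, and flipping the kill switch denies everything while leaving the audit trail intact.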

14. What’s Still Missing

Gaps visible across 42 sources:

  • Skill-level evaluation: benchmarks test whole models, not individual skills
  • Multi-agent coordination quality: no benchmark for how well agents work together
  • Memory quality benchmarks: LOCOMO is closest, but no standard for long-term memory accuracy
  • Conflict resolution: when agents or memories contradict, no standard mechanism
  • Cross-framework interoperability: MCP connects tools, but no standard for agent-to-agent handoff across frameworks
  • Environmental impact: $5T infrastructure has energy implications (partially addressed by ai-environmental-impact)
  • Harness engineering practices: Symphony/OpenAI coined the term but no comprehensive guide exists
  • Cross-model adversarial review: Metaswarm pattern (writer ≠ reviewer) is promising but not yet in wiki as a source
  • Content pipeline: turning wiki/agent knowledge into publishable content (the user’s stated goal)

Analysis based on 42 sources ingested between 2026-04-07 and 2026-05-09. Refreshed 2026-05-09.

See Also