LLM Wiki — Agentic AI Landscape

A persistent, compounding knowledge base about the agentic AI ecosystem, built and maintained by an LLM following the llm-wiki-pattern proposed by andrej-karpathy.

What This Wiki Is

Instead of re-deriving knowledge from scratch on every question, as RAG does, this wiki incrementally compiles and maintains a structured, interlinked collection of markdown files. Every ingested source updates entity pages, concept pages, cross-references, and syntheses, so the knowledge compounds over time.

The human curates sources, directs analysis, and asks questions. The LLM does everything else — summarizing, cross-referencing, filing, and bookkeeping.

What’s Inside

39 sources across tools, standards, methodologies, evaluation, memory, multi-agent frameworks, economics, governance, industry analysis, and production guides:

Tools: scion (GCP), kiro (AWS), claude-code (Anthropic), fabric (Miessler), pai (Miessler), paperclip (company-level orchestration), promptfoo (eval tooling), notebooklm (Google Labs), mem0 (memory management)

Methodologies: spec-kit (GitHub), bmad-method (agile AI-driven), ten-pillars-agentic-skill-design (Forster)

Standards: agent-skills-standard (agentskills.io), mcp-protocol (Model Context Protocol)

Evaluation & Benchmarks: anthropic-eval-guide, evaluating-agent-skills-caparas, promptfoo, humaneval-benchmark (code gen, 96.3%), swe-bench (real-world SE, 74.4%), gaia-benchmark (general AI, humans 92% vs AI <50%), agentbench (8 interactive environments)

Memory & Persistence: mem0-memory-management (LOCOMO benchmarks), continuum-memory-architectures (CMA formal requirements), agent-memory-systems-2026 (four patterns), efficient-memory-architectures (H-MEM, MemGPT, GraphRAG), memory-lifecycle-drift (decay, contradiction, confidence, compression, expiry), langgraph-mem0-integration (LangGraph + Mem0 tutorial), shared-agent-memory (multi-agent shared memory patterns)

Multi-Agent Frameworks: autogen-multi-agent (Microsoft, 56.8K stars), crewai-multi-agent (CrewAI, role-based teams), langgraph-agent-orchestration (LangGraph, stateful graphs), openai-swarm (Swarm, minimal handoffs), crewai-production-guide (production patterns and deployment)

Observability: multi-agent-observability (OpenTelemetry tracing, debugging patterns)

Economics, Governance & Industry: agent-cost-economics ($5T bet, token optimization), agentic-ai-governance (five pillars, NIST/EU AI Act), agentic-ai-non-code-domains (6 industries), agentic-ux-patterns (6 UX patterns for trust), ai-environmental-impact (energy/carbon/water benchmarks, Jevons Paradox)

Practitioner Insights: ai-technique-podcast, skills-pipeline-sleestk

The Emerging Stack

Six distinct layers have emerged across 39 sources:

| Layer | Representatives | Focus |
|---|---|---|
| Company | paperclip | Org charts, budgets, governance, goal alignment |
| Methodology | spec-kit, bmad-method | Specs, plans, tasks, quality gates, agile workflows |
| Infrastructure | scion, langgraph-agent-orchestration, autogen-multi-agent | Containers, runtimes, graph orchestration, state machines |
| Tool | claude-code, kiro, crewai-multi-agent | Agentic loop, skills, hooks, MCP, permissions |
| Pattern | fabric, agent-skills-standard, openai-swarm | Curated prompts, composable strategies, handoff patterns |
| Memory | mem0, agent-memory-persistence | Persistence, retrieval, forgetting, knowledge graphs |

Key Themes

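Across the wiki's sources, twelve themes have been identified (see cross-source-themes for full evidence tables). The source counts below use a denominator of 33, the number of sources covered when the theme analysis was last refreshed: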

Strongest themes (10+ sources):

  • Context beats clever prompting (15/33) — 93% token reduction with selective retrieval (mem0-memory-management)
  • Composition over monoliths (14/33) — validated across all OSS frameworks and product tools
  • The human stays in the loop — but how much? (13/33) — formalized as six UX patterns with metrics (agentic-ux-patterns)
  • Memory architectures are understood (10/33) — CMA defines six formal requirements, four patterns documented, forgetting is a design requirement (agent-memory-persistence)

Established themes (5-9 sources):

  • Evaluation has a framework (8/33) — from code gen (96.3%) to general reasoning (humans 92% vs AI <50%) (agent-benchmarks)
  • Skills evolving into a standard (6/33) — Fabric Patterns → Agent Skills Standard → Claude Code Skills
  • Git as universal substrate (6/33) — every tool uses git for coordination
  • Open standards winning (5/33) — MCP + Agent Skills as two-layer open substrate
  • Token economics drive architecture (5/33) — 60-80% of tokens are waste (agent-cost-economics)

Emerging themes (3-4 sources):

  • Graphs are the consensus orchestration architecture (4/33) — AutoGen and LangGraph converging
  • Governance is the next frontier (4/33) — 68% use AI without IT approval (agentic-ai-governance)
  • Agentic AI expanding beyond code (3/33) — six industries, SaaS disruption (agentic-ai-non-code-domains)
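The three tiers above follow simple source-count thresholds (10+, 5-9, 3-4). As a minimal sketch, the grouping could be reproduced as follows. The counts are copied from the list above; the function names and the "unclassified" label for themes with fewer than three sources are illustrative assumptions, not part of the wiki.

```python
from collections import defaultdict

# Source counts copied from the theme list above (each out of 33 sources).
THEME_COUNTS = {
    "Context beats clever prompting": 15,
    "Composition over monoliths": 14,
    "The human stays in the loop": 13,
    "Memory architectures are understood": 10,
    "Evaluation has a framework": 8,
    "Skills evolving into a standard": 6,
    "Git as universal substrate": 6,
    "Open standards winning": 5,
    "Token economics drive architecture": 5,
    "Graphs are the consensus orchestration architecture": 4,
    "Governance is the next frontier": 4,
    "Agentic AI expanding beyond code": 3,
}


def theme_tier(source_count: int) -> str:
    """Map a theme's supporting-source count to its tier."""
    if source_count >= 10:
        return "strongest"
    if source_count >= 5:
        return "established"
    if source_count >= 3:
        return "emerging"
    # Assumption: the wiki does not name a tier below three sources.
    return "unclassified"


def group_by_tier(counts: dict) -> dict:
    """Group theme names by tier, preserving input order within each tier."""
    tiers = defaultdict(list)
    for theme, n in counts.items():
        tiers[theme_tier(n)].append(theme)
    return dict(tiers)


tiers = group_by_tier(THEME_COUNTS)
for tier, themes in tiers.items():
    print(f"{tier}: {len(themes)} themes")
```

Keeping the thresholds in one function means a theme moves between tiers automatically when a newly ingested source raises its count.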

Analysis

11 synthesized analyses filed back into the wiki:

| Analysis | Focus |
|---|---|
| key-insights-agentic-landscape | 14 key insights across the landscape (refreshed for 33 sources) |
| cross-source-themes | 12 themes with evidence tables (refreshed for 33 sources) |
| ten-pillars-evidence-map | How the wiki validates the Ten Pillars framework |
| how-to-eval-a-skill | Practical eval guide: 5 surfaces, 3 tiers, pass@k |
| memory-architecture-comparison | Four memory patterns compared, CMA requirements, benchmarks, decision framework |
| multi-agent-framework-guide | Eight approaches compared (4 OSS + 4 product), graph convergence thesis |
| cost-optimization-guide | Five waste vectors, optimization playbook, architecture-cost connections |
| governance-safety-overview | Five governance pillars, regulatory landscape, UX as governance layer |
| beyond-code-industry-impact | Six industries, wiki themes generalization test, SaaS disruption |
| wiki-gap-analysis-apr-2026 | 20 gaps identified via conversational query testing, 15 addressed |
| getting-started-guide | Five-phase guide from single agent to multi-agent, cost budgets |

How It Works

Three operations:

  • Ingest: Drop a source → LLM processes it → creates/updates wiki pages → updates index and log
  • Query: Ask a question → LLM reads index, synthesizes answer → optionally files back as analysis
  • Lint: Health-check for contradictions, orphan pages, stale claims, missing cross-references
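To make the Lint operation concrete, here is a minimal sketch of two of its checks: orphan pages and dangling cross-references. It assumes pages link to each other with `[[page-name]]` wikilinks and that an `index` page is the entry point; both are illustrative assumptions, since the wiki's actual link syntax and tooling are not described here. Detecting contradictions and stale claims would need the LLM itself and is not covered.

```python
import re

# Assumption: pages cross-reference each other with [[page-name]] wikilinks,
# optionally written as [[page-name|label]] or [[page-name#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def extract_links(text: str) -> set:
    """Return the set of page names linked from a page's markdown text."""
    return {match.strip() for match in WIKILINK.findall(text)}


def lint(pages: dict, roots: frozenset = frozenset({"index"})) -> dict:
    """Report orphan pages and dangling links for an in-memory wiki.

    pages maps page name -> markdown text. Root pages (hypothetical
    entry points such as the index) are never reported as orphans.
    """
    inbound = {name: set() for name in pages}
    dangling = []
    for name, text in pages.items():
        for target in extract_links(text):
            if target not in pages:
                dangling.append(f"{name} -> {target}")
            elif target != name:
                # Self-links do not count as inbound references.
                inbound[target].add(name)
    orphans = sorted(
        name for name, sources in inbound.items()
        if not sources and name not in roots
    )
    return {"orphans": orphans, "dangling": sorted(dangling)}


# Tiny hypothetical wiki used only to exercise the checks.
pages = {
    "index": "See [[mem0]] and [[mcp-protocol]].",
    "mem0": "Related: [[agent-memory-persistence]].",
    "mcp-protocol": "Used by [[claude-code]].",
    "fabric": "Curated prompts.",
}
report = lint(pages)
print(report)
```

In this toy wiki, `fabric` is reported as an orphan because nothing links to it, and the links to `agent-memory-persistence` and `claude-code` are reported as dangling because those pages do not exist yet.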

Browse

  • Sources — 39 raw sources that feed this wiki
  • Concepts — 20 concept pages covering patterns, standards, and architectural ideas
  • Entities — 17 pages for tools, people, and organizations
  • Analysis — 12 synthesized analyses filed back into the wiki