The Benchmark Loop

Lesson 4 · Learn Graphify · ~8 minutes

You've built a graph and captured your baseline. Now measure the theoretical maximum savings — and learn how to use benchmarking as a feedback loop for graph quality.

Win

After this lesson, you can run graphify benchmark, interpret the output, and use it to decide whether your graph needs tuning.

What the Benchmark Measures

The benchmark command compares two approaches:

  1. Naive approach: Read every source file into context (total token count of your codebase)
  2. Graph approach: Read graph.json into context (token count of the graph)

Dividing the first number by the second gives your theoretical token reduction. It represents the maximum possible savings: what you would get if the agent never needed to read a source file at all.

graphify benchmark

Reading the Output

# Example output:
# Source corpus: 1,247,832 tokens (423 files)
# Graph size:      18,442 tokens
# Reduction: 67.7x
# 
# Largest communities by token weight:
#   Test Framework Assertions: 4,211 tokens (22.8%)
#   MongoDB Connection Pool:   3,102 tokens (16.8%)
#   Parametrize Helpers:       2,889 tokens (15.7%)

Field              What it tells you
Source corpus      What the agent would consume without a graph
Graph size         What the agent consumes if it reads only graph.json
Reduction          Theoretical max savings (your ceiling)
Community weights  Where the graph spends its tokens; heavily connected areas cost more
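
To make the arithmetic concrete, here is a minimal sketch that reproduces the figures in the example output. It assumes the reduction is source tokens divided by graph tokens, and that each community percentage is that community's share of the graph size. Both assumptions match the example numbers, but they are inferred from the output rather than taken from Graphify's documentation.

```python
# Figures copied from the example output in this lesson.
source_tokens = 1_247_832
graph_tokens = 18_442

# Assumed formula: reduction = source corpus tokens / graph tokens.
reduction = source_tokens / graph_tokens
print(f"Reduction: {reduction:.1f}x")

# Assumed formula: a community's percentage is its share of the graph's tokens.
communities = {
    "Test Framework Assertions": 4_211,
    "MongoDB Connection Pool": 3_102,
    "Parametrize Helpers": 2_889,
}
shares = {name: 100 * tokens / graph_tokens for name, tokens in communities.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

If your own numbers do not reconcile this way, treat the benchmark output as authoritative and this sketch as a reading aid only.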

Theoretical vs. Actual

Your actual savings will be lower than the benchmark number. The agent still reads some source files after consulting the graph — it just reads fewer of them, and the right ones.
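
One rough way to see why the real number is lower: add the tokens from the source files the agent still opens to the graph's tokens, then recompute the ratio. The sketch below uses the example benchmark figures; the 50,000 tokens of follow-up reads is a hypothetical value chosen for illustration, not something Graphify reports.

```python
def effective_reduction(source_tokens, graph_tokens, extra_source_tokens):
    """Reduction when the agent reads the graph plus some source files."""
    return source_tokens / (graph_tokens + extra_source_tokens)


source_tokens = 1_247_832
graph_tokens = 18_442

# With no follow-up reads, this is the benchmark's theoretical ceiling.
ceiling = effective_reduction(source_tokens, graph_tokens, 0)

# Hypothetical: the agent also opens source files totalling 50,000 tokens.
with_reads = effective_reduction(source_tokens, graph_tokens, 50_000)

print(f"ceiling: {ceiling:.1f}x, with follow-up reads: {with_reads:.1f}x")
```

Even a modest number of follow-up reads pulls the effective ratio well below the ceiling, which is why the benchmark figure is best read as an upper bound.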

Rules of thumb:

Note

If your benchmark is under 5x, your project might be small enough that Graphify isn't worth the overhead. That's fine — it's a tool for scale.

Using Benchmarks as a Feedback Loop

Run graphify benchmark after each significant change to your graph config or codebase:

  1. After initial build — record your baseline ratio
  2. After excluding files (tests, generated code) — did the ratio improve?
  3. After a major refactor — did the graph grow proportionally or blow up?
  4. After switching clustering backends — did community quality change?

If the ratio drops significantly after a code change, the graph may be bloated. Consider excluding generated files or using --no-cluster for a lean extraction.
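
The loop above can be sketched as a small log of ratios with a check for significant drops. Everything here is illustrative: the recorded ratios after the first are made up, and the 20% tolerance is an assumed threshold, not a Graphify default.

```python
def ratio_dropped(previous, current, tolerance=0.20):
    """True if current is more than `tolerance` (as a fraction) below previous."""
    return current < previous * (1 - tolerance)


# Hypothetical log of (step, reduction ratio) recorded after each benchmark run.
history = [
    ("initial build", 67.7),
    ("after excluding generated files", 74.0),
    ("after major refactor", 51.2),
]

# Compare each run with the one before it and keep the significant drops.
flagged = [
    (label, prev_ratio, ratio)
    for (_, prev_ratio), (label, ratio) in zip(history, history[1:])
    if ratio_dropped(prev_ratio, ratio)
]

for label, prev_ratio, ratio in flagged:
    print(f"{label}: {prev_ratio}x -> {ratio}x, worth investigating")
```

Keeping the log next to your Lesson 2 baseline numbers makes it easier to tell a one-off dip from a trend.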

Excluding Files for Better Ratios

Large codebases often have files that add noise without value:

# Create a .graphifyignore file (same syntax as .gitignore)
echo "*.generated.py
docs/
__pycache__/
*.pyc" > .graphifyignore

# Rebuild
graphify .

Re-run graphify benchmark after excluding — you want a graph that's dense with signal, not padded with generated code.
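
Before rebuilding, it can help to preview which files a set of patterns would drop. The sketch below is a deliberately simplified matcher written for this lesson: it is not Graphify's matcher and does not implement full .gitignore semantics (no negation, no anchored paths, no `**`). The file paths are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def roughly_ignored(path, patterns):
    """Approximate ignore check; a simplification of .gitignore rules."""
    p = PurePosixPath(path)
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match any parent directory of the file.
            if pattern.rstrip("/") in p.parts[:-1]:
                return True
        elif fnmatch(p.name, pattern):
            # File pattern: match against the file name only.
            return True
    return False


# Patterns from the .graphifyignore example in this lesson.
patterns = ["*.generated.py", "docs/", "__pycache__/", "*.pyc"]

# Hypothetical project paths, for illustration only.
paths = [
    "src/app.py",
    "src/schema.generated.py",
    "docs/index.md",
    "src/__pycache__/app.cpython-311.pyc",
]

kept = [p for p in paths if not roughly_ignored(p, patterns)]
print(kept)
```

The benchmark itself remains the real test: if the ratio does not move after a rebuild, the excluded files were probably not contributing much to the graph.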

Verify Your Understanding

If graphify benchmark reports a 45x reduction, what actual savings should you expect?

When should you not bother with Graphify?

Do This Now

  1. Run graphify benchmark on your project
  2. Record the reduction ratio alongside your Lesson 2 baseline numbers
  3. If the ratio is under 10x, consider whether the project needs a graph at all
  4. Try adding a .graphifyignore and re-benchmarking — see if the ratio improves