Integrating with CI

Lesson 9 · Learn Graphify · ~10 minutes

Keep your graph current and detect architectural drift automatically. This lesson shows how to run Graphify in CI pipelines — rebuilding the graph on merge, detecting boundary violations, and reporting changes in PRs.

Win

After this lesson, your graph stays fresh without manual intervention, and PRs that break architectural boundaries get flagged automatically.

Why CI Integration?

Local hooks (Lesson 5) work for individual developers. CI integration solves team-level problems:

Freshness: The committed graph is always up-to-date with main
Drift detection: PRs that introduce circular dependencies get flagged
Benchmark tracking: The token reduction ratio is tracked over time
Team onboarding: New developers clone the repo with a ready-to-use graph

Strategy 1: Rebuild on Merge

The simplest approach — rebuild and commit the graph whenever main changes:

# .github/workflows/update-graph.yml
name: Update Knowledge Graph

on:
  push:
    branches: [main]
    paths-ignore:
      - 'graphify-out/**'

jobs:
  update-graph:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # lets the job push the rebuilt graph back to main
    steps:
      - uses: actions/checkout@v4

      - name: Install Graphify
        run: pip install graphify-cli

      - name: Build graph
        run: graphify . --no-viz
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

      - name: Commit updated graph
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add graphify-out/
          # Nothing staged means the graph is unchanged, so skip the commit
          git diff --cached --quiet && exit 0
          git commit -m "Update knowledge graph [skip ci]"
          git push

Note

Use --no-viz in CI to skip the HTML visualization (saves time, reduces commit noise). Use --no-cluster if you don't have an LLM API key in CI — you'll get raw extraction without named communities.

Strategy 2: Drift Detection on PRs

Build the graph on the PR branch and compare it against the graph committed on main to detect structural changes:

# .github/workflows/graph-check.yml
name: Graph Drift Check

on:
  pull_request:
    branches: [main]

jobs:
  check-drift:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # needed to post the comment below
    steps:
      - uses: actions/checkout@v4

      - name: Install Graphify
        run: pip install graphify-cli

      - name: Build PR graph
        run: graphify . --no-viz --no-cluster

      - name: Compare with main graph
        run: |
          # Download main's graph for comparison
          git fetch origin main
          git show origin/main:graphify-out/graph.json > /tmp/main-graph.json 2>/dev/null || exit 0

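          # Count node and edge changes
          PR_NODES=$(jq '.nodes | length' graphify-out/graph.json)
          MAIN_NODES=$(jq '.nodes | length' /tmp/main-graph.json)
          echo "Nodes: $MAIN_NODES → $PR_NODES (Δ $((PR_NODES - MAIN_NODES)))"
          PR_EDGES=$(jq '.edges | length' graphify-out/graph.json)
          MAIN_EDGES=$(jq '.edges | length' /tmp/main-graph.json)
          echo "Edges: $MAIN_EDGES → $PR_EDGES (Δ $((PR_EDGES - MAIN_EDGES)))"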

      - name: Run benchmark
        run: graphify benchmark | tee /tmp/benchmark.txt

      - name: Comment on PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const benchmark = fs.readFileSync('/tmp/benchmark.txt', 'utf8');
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `### 📊 Graph Impact\n\`\`\`\n${benchmark}\n\`\`\``
            });
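
The counts in the comparison step only show net growth, so a PR that removes one node and adds another looks unchanged. A minimal Python sketch of a fuller comparison follows. It assumes graph.json has a `nodes` array whose entries carry an `id` field, and an `edges` array whose entries carry `source` and `target` fields. The `id` field and the script name graph_diff.py are assumptions for illustration, not documented Graphify behavior.

```python
# graph_diff.py (hypothetical helper, not part of Graphify)
import json
import sys


def summarize(graph):
    """Return (node_ids, edge_pairs) as sets for one parsed graph."""
    # Assumption: nodes have an "id"; edges have "source" and "target".
    nodes = {node["id"] for node in graph.get("nodes", [])}
    edges = {(edge["source"], edge["target"]) for edge in graph.get("edges", [])}
    return nodes, edges


def diff_graphs(main_graph, pr_graph):
    """List the nodes and edges added or removed between two parsed graphs."""
    main_nodes, main_edges = summarize(main_graph)
    pr_nodes, pr_edges = summarize(pr_graph)
    return {
        "nodes_added": sorted(pr_nodes - main_nodes),
        "nodes_removed": sorted(main_nodes - pr_nodes),
        "edges_added": sorted(pr_edges - main_edges),
        "edges_removed": sorted(main_edges - pr_edges),
    }


def load(path):
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python graph_diff.py <main-graph.json> <pr-graph.json>
    changes = diff_graphs(load(sys.argv[1]), load(sys.argv[2]))
    for key, items in changes.items():
        print(f"{key}: {len(items)}")
        for item in items:
            print(f"  {item}")
```

In the workflow this could run after the comparison step, for example as `python graph_diff.py /tmp/main-graph.json graphify-out/graph.json`, with the output appended to the PR comment.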

Strategy 3: Boundary Enforcement

Define architectural rules and fail the build when they're violated:

# .graphify-rules.yml (hypothetical — implement with jq or a script)
boundaries:
  - name: "Tests don't import from other test directories"
    rule: "no edge from tests/operators/min/* to tests/operators/max/*"

  - name: "Framework doesn't depend on tests"
    rule: "no edge from framework/* to tests/*"

  - name: "No circular imports"
    rule: "no cycles in import edges"

Until Graphify has native rule support, you can implement basic checks with jq:

# Check for backwards dependencies: framework importing from tests
VIOLATIONS=$(jq '[.edges[] | select(.source | startswith("framework/"))
  | select(.target | startswith("tests/"))] | length' graphify-out/graph.json)

if [ "$VIOLATIONS" -gt 0 ]; then
  echo "❌ Found $VIOLATIONS boundary violations: framework imports from tests"
  exit 1
fi
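
The "No circular imports" rule is awkward to express in jq, so a short script is the easier route. The sketch below is one possible approach, using an iterative depth-first search over the edge list. It assumes the same `source` and `target` edge fields as the jq check above and treats every edge as a dependency. The script name check_cycles.py is hypothetical.

```python
# check_cycles.py (hypothetical helper, not part of Graphify)
import json
import sys
from collections import defaultdict

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def find_cycle(edges):
    """Return one cycle as a list of node ids (first == last), or None."""
    adjacency = defaultdict(list)
    for source, target in edges:
        adjacency[source].append(target)

    color = defaultdict(int)  # every node starts WHITE
    for start in list(adjacency):
        if color[start] != WHITE:
            continue
        # `path` mirrors the nodes on the current DFS stack.
        stack = [(start, iter(adjacency[start]))]
        path = [start]
        color[start] = GRAY
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if color[nxt] == GRAY:
                    # Back edge: nxt is already on the path, so we closed a loop.
                    return path[path.index(nxt):] + [nxt]
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    path.append(nxt)
                    stack.append((nxt, iter(adjacency.get(nxt, []))))
                    break
            else:
                # All neighbours explored without finding a cycle.
                color[node] = BLACK
                stack.pop()
                path.pop()
    return None


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python check_cycles.py graphify-out/graph.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        graph = json.load(handle)
    cycle = find_cycle((e["source"], e["target"]) for e in graph.get("edges", []))
    if cycle:
        print("Circular dependency: " + " -> ".join(cycle))
        sys.exit(1)
    print("No cycles found")
```

The script exits with status 1 when it finds a cycle, so it can fail a CI step the same way the jq check does. If the graph mixes import edges with other edge kinds, filter the edge list before passing it in.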

Cost Consideration

Running with clustering in CI makes LLM API calls, which cost money on every run. Use --no-cluster for drift detection and boundary checks, since those only need the raw graph structure. Save clustering for the rebuild on merge to main (Strategy 1).

Committing the Graph or Not

Approach	Pros	Cons
Commit `graphify-out/`	Always available; clone and go	Adds to repo size; merge conflicts
Generate on demand	Clean repo; no conflicts	Requires API key locally; slower start
CI artifact	Available for download; not in tree	Extra step to fetch before sessions

For most teams, committing the graph is best — it means git clone gives you everything needed to start an AI-assisted session immediately.

Verify Your Understanding

Why use --no-cluster in CI drift detection workflows?

Clustering is not supported on Linux runners
Drift detection only needs raw structure, and clustering costs LLM API calls
Clustering takes hours and would time out the workflow
The cluster names change every run, creating false positives

What's the main advantage of committing graphify-out/ to the repo?

It makes the graph immutable and prevents tampering
GitHub renders graph.json as a searchable file by default
Clone-and-go — anyone can start an AI session immediately without building
It triggers GitHub's code intelligence features on the graph data

Do This Now

Decide: commit the graph or generate on demand?
Add one of the workflow files above to your project's .github/workflows/
If committing: add graphify-out/ to version control and push
Open a test PR and verify the workflow runs