Integrating with CI

Lesson 9 · Learn Graphify · ~10 minutes

Keep your graph current and detect architectural drift automatically. This lesson shows how to run Graphify in CI pipelines — rebuilding the graph on merge, detecting boundary violations, and reporting changes in PRs.

Win

After this lesson, your graph stays fresh without manual intervention, and PRs that break architectural boundaries get flagged automatically.

Why CI Integration?

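Local hooks (Lesson 5) work for individual developers. CI integration solves team-level problems: keeping the shared graph current after every merge, flagging boundary violations before they land, and reporting graph changes in PRs.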

Strategy 1: Rebuild on Merge

The simplest approach — rebuild and commit the graph whenever main changes:

# .github/workflows/update-graph.yml
name: Update Knowledge Graph

on:
  push:
    branches: [main]
    paths-ignore:
      - 'graphify-out/**'

jobs:
  update-graph:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # lets the job push the commit; the default token may be read-only
    steps:
      - uses: actions/checkout@v4

      - name: Install Graphify
        run: pip install graphify-cli

      - name: Build graph
        run: graphify . --no-viz
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

      - name: Commit updated graph
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add graphify-out/
          git diff --cached --quiet && exit 0
          git commit -m "Update knowledge graph [skip ci]"
          git push

Note

Use --no-viz in CI to skip the HTML visualization (saves time, reduces commit noise). Use --no-cluster if you don't have an LLM API key in CI — you'll get raw extraction without named communities.
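
The flag choice described in this note can be made automatically. The following is a minimal sketch, not part of Graphify: the helper name `graphify_command` is hypothetical, and it assumes the key is supplied as `ANTHROPIC_API_KEY`, as in the workflow above. It uses only the flags named in this lesson.

```python
import os
from typing import List, Mapping


def graphify_command(env: Mapping[str, str]) -> List[str]:
    """Build the CI command line from the flags described in this lesson.

    Hypothetical helper: assumes the LLM key is passed as ANTHROPIC_API_KEY.
    """
    # Always skip the HTML visualization in CI.
    cmd = ["graphify", ".", "--no-viz"]
    # Without an LLM key, fall back to raw extraction (no named communities).
    if not env.get("ANTHROPIC_API_KEY"):
        cmd.append("--no-cluster")
    return cmd


if __name__ == "__main__":
    # Print the command so a workflow step can log or execute it.
    print(" ".join(graphify_command(os.environ)))
```

A workflow step could run this script and execute the printed command, so the same workflow file works in forks that have no API key configured.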

Strategy 2: Drift Detection on PRs

Run the graph on the PR branch and compare against main to detect structural changes:

# .github/workflows/graph-check.yml
name: Graph Drift Check

on:
  pull_request:
    branches: [main]

jobs:
  check-drift:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # needed to post the PR comment in the last step
    steps:
      - uses: actions/checkout@v4

      - name: Install Graphify
        run: pip install graphify-cli

      - name: Build PR graph
        run: graphify . --no-viz --no-cluster

      - name: Compare with main graph
        run: |
          # Download main's graph for comparison
          git fetch origin main
          git show origin/main:graphify-out/graph.json > /tmp/main-graph.json 2>/dev/null || exit 0

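          # Count node and edge changes
          PR_NODES=$(jq '.nodes | length' graphify-out/graph.json)
          MAIN_NODES=$(jq '.nodes | length' /tmp/main-graph.json)
          PR_EDGES=$(jq '.edges | length' graphify-out/graph.json)
          MAIN_EDGES=$(jq '.edges | length' /tmp/main-graph.json)
          echo "Nodes: $MAIN_NODES → $PR_NODES (Δ $((PR_NODES - MAIN_NODES)))"
          echo "Edges: $MAIN_EDGES → $PR_EDGES (Δ $((PR_EDGES - MAIN_EDGES)))"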

      - name: Run benchmark
        run: graphify benchmark | tee /tmp/benchmark.txt

      - name: Comment on PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const benchmark = fs.readFileSync('/tmp/benchmark.txt', 'utf8');
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `### 📊 Graph Impact\n\`\`\`\n${benchmark}\n\`\`\``
            });
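
Counts alone hide which parts of the graph moved. As a rough sketch of a more detailed comparison, the script below lists the nodes and edges that were added or removed between the two graph files. It assumes `graph.json` has a top-level `nodes` array whose entries carry an `id` field and an `edges` array whose entries carry `source` and `target`; the `id` field and all function names here are assumptions, not documented Graphify behavior.

```python
import json
from pathlib import Path


def load_graph(path):
    """Read a graph.json file into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def node_ids(graph):
    # Assumption: every node object has an "id" field.
    return {node["id"] for node in graph.get("nodes", [])}


def edge_pairs(graph):
    # Assumption: every edge object has "source" and "target" fields.
    return {(edge["source"], edge["target"]) for edge in graph.get("edges", [])}


def diff_graphs(main, pr):
    """Return sorted lists of nodes and edges added or removed by the PR."""
    return {
        "nodes_added": sorted(node_ids(pr) - node_ids(main)),
        "nodes_removed": sorted(node_ids(main) - node_ids(pr)),
        "edges_added": sorted(edge_pairs(pr) - edge_pairs(main)),
        "edges_removed": sorted(edge_pairs(main) - edge_pairs(pr)),
    }


def format_report(diff):
    """Render the diff as plain text: one count line per category, then its items."""
    lines = []
    for category, items in diff.items():
        lines.append(f"{category}: {len(items)}")
        for item in items:
            lines.append(f"  {item}")
    return "\n".join(lines)
```

In the workflow, a step could call `diff_graphs(load_graph("/tmp/main-graph.json"), load_graph("graphify-out/graph.json"))` and write the output of `format_report` to a file for the PR comment.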

Strategy 3: Boundary Enforcement

Define architectural rules and fail the build when they're violated:

# .graphify-rules.yml (hypothetical — implement with jq or a script)
boundaries:
  - name: "Tests don't import from other test directories"
    rule: "no edge from tests/operators/min/* to tests/operators/max/*"

  - name: "Framework doesn't depend on tests"
    rule: "no edge from framework/* to tests/*"

  - name: "No circular imports"
    rule: "no cycles in import edges"

Until Graphify has native rule support, you can implement basic checks with jq:

# Check for backwards dependencies: framework importing from tests
VIOLATIONS=$(jq '[.edges[] | select(.source | startswith("framework/"))
  | select(.target | startswith("tests/"))] | length' graphify-out/graph.json)

if [ "$VIOLATIONS" -gt 0 ]; then
  echo "❌ Found $VIOLATIONS boundary violations: framework imports from tests"
  exit 1
fi
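
The "no cycles" rule needs a graph traversal rather than a filter, which is awkward in jq. The sketch below covers both kinds of rule in Python: a prefix check equivalent to the jq filter above, and a depth-first search that returns one cycle if any exists. It assumes each edge is an object with `source` and `target` strings, as the jq example does. The function names are hypothetical, and restricting the input to import edges is left to the caller because the field that marks an edge's type is not shown here.

```python
from collections import defaultdict


def boundary_violations(edges, source_prefix, target_prefix):
    """Return (source, target) pairs that cross from source_prefix into target_prefix."""
    return [
        (edge["source"], edge["target"])
        for edge in edges
        if edge["source"].startswith(source_prefix)
        and edge["target"].startswith(target_prefix)
    ]


def find_cycle(edges):
    """Return one cycle as a list of node names, or None if there is none.

    Iterative depth-first search with three node states, so deep graphs
    do not hit Python's recursion limit.
    """
    adjacency = defaultdict(list)
    for edge in edges:
        adjacency[edge["source"]].append(edge["target"])

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = defaultdict(int)  # every node starts as UNVISITED

    for root in list(adjacency):
        if state[root] != UNVISITED:
            continue
        state[root] = IN_PROGRESS
        path = [root]  # nodes on the current DFS path
        stack = [(root, iter(adjacency[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if state[child] == IN_PROGRESS:
                    # Back edge: the cycle is the path from child to here.
                    return path[path.index(child):] + [child]
                if state[child] == UNVISITED:
                    state[child] = IN_PROGRESS
                    path.append(child)
                    stack.append((child, iter(adjacency[child])))
                    break
            else:
                # All children explored: this node is finished.
                state[node] = DONE
                path.pop()
                stack.pop()
    return None
```

A CI script could load `graphify-out/graph.json`, pass its `edges` array to both functions, and exit with a non-zero status when `boundary_violations` returns a non-empty list or `find_cycle` returns a cycle.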

Cost Consideration

Running with clustering makes LLM API calls, so every CI run incurs charges. Use --no-cluster for drift detection and boundary checks, which need only the raw graph structure. Save clustering for the rebuild-on-merge workflow on main (Strategy 1).

Committing the Graph or Not

| Approach | Pros | Cons |
|---|---|---|
| Commit graphify-out/ | Always available; clone and go | Adds to repo size; merge conflicts |
| Generate on demand | Clean repo; no conflicts | Requires API key locally; slower start |
| CI artifact | Available for download; not in tree | Extra step to fetch before sessions |

For most teams, committing the graph is best — it means git clone gives you everything needed to start an AI-assisted session immediately.

Verify Your Understanding

Why use --no-cluster in CI drift detection workflows?

What's the main advantage of committing graphify-out/ to the repo?

Do This Now

  1. Decide: commit the graph or generate on demand?
  2. Add one of the workflow files above to your project's .github/workflows/
  3. If committing: add graphify-out/ to version control and push
  4. Open a test PR and verify the workflow runs