Scheduled agents are powerful — they work while you sleep. But they also spend while you sleep. Without cost controls, a daily workflow can quietly accumulate hundreds of dollars per month in AI credits and Actions minutes. This lesson gives you the tools to prevent runaway costs and gain visibility into what your agents are actually doing.
Every workflow run incurs two costs: AI credits consumed by the model and Actions minutes consumed by the runner.
A single run might cost $5. That seems fine. But $5 × 365 days = $1,825/year. And that's one workflow. Most teams end up with 5–15 scheduled workflows. Without budgets, the default behavior is uncapped spending that scales linearly with time.
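The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes every workflow runs once a day at the same $5 per run, which real workflows will not do exactly.

```python
def annual_cost(cost_per_run, runs_per_year=365, workflows=1):
    """Project yearly spend, assuming identical workflows that run daily."""
    return cost_per_run * runs_per_year * workflows


# One workflow at $5 per run, as in the text.
print(annual_cost(5.0))

# The same cost applied to a team with 5 or 15 scheduled workflows.
for count in (5, 15):
    print(count, annual_cost(5.0, workflows=count))
```

Under these assumptions, a team at the upper end of the 5–15 range would be spending fifteen times the single-workflow figure, which is why a per-workflow budget matters.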
The core principle: every scheduled workflow must have an explicit budget. Never rely on defaults for production automation.
max-ai-credits
The max-ai-credits frontmatter field sets a hard ceiling on AI credit consumption per run. If the workflow reaches this limit, it stops immediately, mid-analysis if necessary.
---
description: "Daily code review suggestions"
on:
  schedule: daily around 8am on weekdays
engine:
  id: copilot
  model: gpt-4.1-mini
max-ai-credits: 200
max-daily-ai-credits: 800
safe-outputs:
  create-issue:
    title-prefix: "[daily-review] "
    close-older-issues: true
---
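One way to reason about how the two caps in the frontmatter above combine is the sketch below. It models only the arithmetic: the function name is hypothetical, and it assumes the daily cap limits cumulative spend across all runs in a day. How the tool actually enforces the daily cap (refusing to start a run versus stopping one mid-flight) is not specified here.

```python
def run_allowance(max_ai_credits, max_daily_ai_credits, spent_today):
    """Credits a new run may spend, assuming both caps apply at once."""
    remaining_today = max(0, max_daily_ai_credits - spent_today)
    return min(max_ai_credits, remaining_today)


# Values from the frontmatter above: 200 per run, 800 per day.
spent = 0
for attempt in range(1, 6):
    allowance = run_allowance(200, 800, spent)
    print(f"run {attempt}: may spend up to {allowance} AIC")
    spent += allowance  # worst case: every run exhausts its allowance
```

In this worst case the first four runs each get the full per-run budget and a fifth retry or manual trigger gets nothing, which is the "second layer of protection" role the daily cap is meant to play.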
Key behaviors of max-ai-credits:
- The limit applies per run: a workflow that runs once a day with max-ai-credits: 200 can spend 200 per day.
- Add max-daily-ai-credits as a second layer of protection against multiple retries or manual triggers burning through budget.
Not every workflow needs the most capable model. Choosing the right model for the task is the single biggest cost lever you have.
| Model | Best For | Relative Cost |
|---|---|---|
| gpt-4.1-mini | Scanning, summarization, pattern matching, routine monitoring | $ (cheapest) |
| claude-haiku-4-5 | Fast classification, simple analysis, structured extraction | $ |
| gpt-4.1 | Code generation, complex reasoning, multi-step analysis | $$$ |
| claude-sonnet-4 | Nuanced writing, architectural analysis, code review | $$$ |
| o3 / claude-opus-4 | Multi-step reasoning, complex refactoring, research | $$$$$ |
Rule of thumb: start every scheduled workflow with the cheapest model that produces acceptable output. Upgrade only when you see quality failures in the logs.
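The rule of thumb can be written down as a small decision helper. This is a sketch, not part of any tool: the tier grouping follows the relative-cost column of the table above, while the function names and the idea of counting quality failures from the logs are assumptions made for illustration.

```python
# Cost tiers, cheapest first, following the table above.
COST_TIERS = [
    ("$", ["gpt-4.1-mini", "claude-haiku-4-5"]),
    ("$$$", ["gpt-4.1", "claude-sonnet-4"]),
    ("$$$$$", ["o3", "claude-opus-4"]),
]


def tier_index(model):
    """Return the position of the model's cost tier."""
    for index, (_, models) in enumerate(COST_TIERS):
        if model in models:
            return index
    raise ValueError(f"unknown model: {model}")


def recommend(current, quality_failures):
    """Stay put unless failures were seen; then suggest the next tier up."""
    if quality_failures == 0:
        return [current]
    index = tier_index(current)
    if index + 1 < len(COST_TIERS):
        return COST_TIERS[index + 1][1]
    return [current]  # already in the most expensive tier


print(recommend("gpt-4.1-mini", 0))
print(recommend("gpt-4.1-mini", 2))
```

The helper moves up one tier at a time rather than jumping to the most capable model, which keeps each upgrade tied to an observed failure.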
Beyond model selection, four levers reduce cost per run:
| Lever | How | Impact |
|---|---|---|
| Tighter prompts | Remove preamble, examples, and instruction the model doesn't need. Be specific about output format. | 20–40% fewer input tokens |
| Fewer output tokens | Ask for concise output. "Summarize in 3 bullets" costs less than "write a detailed report." | 30–60% fewer output tokens |
| Skip-if conditions | Add skip-if logic so the workflow doesn't run when there's nothing to analyze (no new commits, no open PRs). | Eliminates entire runs |
| Scoped file reads | Use permissions: contents: read with path filters instead of reading the entire repo. | 50–80% less context |
---
on:
  schedule: daily around 9am on weekdays
  skip-if:
    no-commits-since: 24h
    no-open-prs: true
---
# Only runs if there's actually something to review
gh aw logs
You can't optimize what you can't measure. The gh aw logs command shows exactly what each run consumed:
$ gh aw logs daily-review --last 7
Run ID    Date       Duration AIC Used Status
───────── ────────── ──────── ──────── ──────
a3f2c1e   Jun 28     42s      148      ✓ success
b7d4e2a   Jun 27     38s      135      ✓ success
c9a1f3b   Jun 26     1m12s    312      ✓ success  ← spike
d2e5a4c   Jun 25     35s      127      ✓ success
e4b6c7d   Jun 24     40s      141      ✓ success
f1a8d9e   Jun 23     0s       0        ⊘ skipped (no-commits)
g3c2e1f   Jun 22     0s       0        ⊘ skipped (weekend)
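A week of log rows like the sample above can be rolled up into a few headline numbers. The sketch below is illustrative only: the run data is copied by hand from the sample output, and the summarize helper and its field names are hypothetical rather than anything the CLI provides.

```python
# (run_id, aic_used, status) copied from the sample output above.
RUNS = [
    ("a3f2c1e", 148, "success"),
    ("b7d4e2a", 135, "success"),
    ("c9a1f3b", 312, "success"),
    ("d2e5a4c", 127, "success"),
    ("e4b6c7d", 141, "success"),
    ("f1a8d9e", 0, "skipped"),
    ("g3c2e1f", 0, "skipped"),
]


def summarize(runs):
    """Roll a list of runs up into counts, total spend, and the costliest run."""
    executed = [(run_id, aic) for run_id, aic, status in runs if status != "skipped"]
    total = sum(aic for _, aic in executed)
    return {
        "executed": len(executed),
        "skipped": len(runs) - len(executed),
        "total_aic": total,
        "avg_aic": total / len(executed) if executed else 0.0,
        "largest_run": max(executed, key=lambda run: run[1])[0] if executed else None,
    }


print(summarize(RUNS))
```

Averaging over executed runs only keeps skipped runs from hiding a rising per-run cost, and the largest run is the natural first candidate for a closer look.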
Key things to look for:
- Cost spikes: inspect the run with gh aw logs <run-id> --detail.
gh aw audit
When a workflow fails or produces unexpected output, gh aw audit shows the full execution trace:
$ gh aw audit daily-review --run c9a1f3b
Run: c9a1f3b (Jun 26, 1m12s, 312 AIC)
Status: success (but over typical budget)
Timeline:
00:00 Started. Model: gpt-4.1-mini
00:02 Read 14 files (src/api/*.ts) — 8,200 tokens
00:08 Read 6 files (tests/*.ts) — 4,100 tokens ← unusual
00:15 Model response: 2,800 tokens (analysis)
00:42 Model response: 1,200 tokens (issue body)
01:12 Output: created issue #247
Token breakdown:
Input: 18,400 tokens (context + prompt)
Output: 4,000 tokens (responses)
Total: 312 AIC
Note: Input was 2x normal due to test file reads triggered by
new test files added in commit a1b2c3d.
The audit trail answers "why did this run cost more?" and "what went wrong?" — essential for debugging scheduled automation that runs without supervision.
For teams running multiple workflows, CLI inspection doesn't scale. Export telemetry to your observability stack for dashboards and alerting:
telemetry:
  otlp:
    endpoint: https://otel.yourcompany.com:4317
    headers:
      Authorization: "Bearer ${secrets.OTLP_TOKEN}"
  export:
    - traces   # Full execution spans
    - metrics  # AIC usage, duration, token counts
    - logs     # Model interactions (redacted)
  alerts:
    - name: budget-spike
      condition: aic_used > 2 * avg(aic_used, 7d)
      notify: slack:#agentic-ops
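The budget-spike condition in the config above compares one run against twice its recent average. A minimal Python sketch of that comparison follows. The function name is hypothetical, and it assumes the trailing window contains only earlier runs and counts skipped runs as 0 AIC; how a real backend evaluates avg(aic_used, 7d) may differ.

```python
def budget_spike(aic_used, trailing_aic, factor=2.0):
    """True when a run exceeds `factor` times the mean of the trailing window."""
    if not trailing_aic:
        return False  # no baseline yet, so nothing to compare against
    baseline = sum(trailing_aic) / len(trailing_aic)
    return aic_used > factor * baseline


# Values taken from the earlier sample log; the windows are chosen by hand.
print(budget_spike(312, [127, 141, 0, 0]))
print(budget_spike(148, [135, 312, 127, 141, 0, 0]))
```

Whether skipped runs belong in the baseline is a design choice: counting them as zero lowers the average and makes the alert more sensitive.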
- Dashboards: track AIC spend per workflow, cost trends over time, and model utilization in Grafana, Datadog, or any OTLP-compatible backend.
- Alerts: get notified on cost spikes, repeated failures, or budget exhaustion before they become a monthly bill surprise.
- Traces: see exactly which files were read, which model calls were made, and where time was spent in each run.
- Trends: spot workflows whose costs are creeping up as the repo grows, so you can optimize before the bill lands.
Cost control isn't just about spending less — it's about spending well. A workflow that costs $3/day but whose output is ignored every time is wasting $90/month. Track whether your automation actually helps:
$ gh aw stats daily-review --last 30d
Runs: 22 (8 skipped)
Avg cost: 145 AIC ($1.45/run)
Monthly spend: ~$32
Acceptance rate: 73% (issues read within 4h)
Action rate: 45% (led to a commit or PR within 24h)
Noop rate: 18%
A workflow with a low action rate isn't necessarily bad — a security scanner that finds nothing 90% of the time is doing its job. But a daily summary that's never read should be made weekly or killed.
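One way to put a number on "spending well" is cost per run that led to action. The sketch below uses the figures from the sample stats output (22 runs, $1.45 average, 45% action rate). The function name is hypothetical and the metric is an illustration, not something the CLI reports.

```python
def cost_per_action(runs, avg_cost_usd, action_rate):
    """Total spend divided by the number of runs that led to a commit or PR."""
    actions = runs * action_rate
    if actions == 0:
        return float("inf")  # money spent, nothing acted on
    return runs * avg_cost_usd / actions


# Figures from the sample stats output above.
print(round(cost_per_action(22, 1.45, 0.45), 2))

# The ignored $3/day summary from the text: 30 runs, none acted on.
print(cost_per_action(30, 3.0, 0.0))
```

As the text notes, a high cost per action is not automatically bad for something like a security scanner, so treat the figure as a prompt for review rather than a verdict.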
Prevent duplicate runs from piling up — especially when a workflow is triggered by both schedule and manual dispatch, or when a slow run overlaps with the next scheduled execution:
---
concurrency:
  group: daily-review-${branch}
  cancel-in-progress: true
---
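The effect of the concurrency block above can be modelled with a small in-memory tracker. This is a toy simulation of the cancel-in-progress: true case only: the class, the run identifiers, and the expanded group name daily-review-main are all hypothetical.

```python
class ConcurrencyGroups:
    """Toy model: at most one active run per group, newest run wins."""

    def __init__(self):
        self.active = {}     # group name -> run id currently executing
        self.cancelled = []  # run ids cancelled by a newer run

    def start(self, group, run_id):
        """Start a run, cancelling any run already active in the same group."""
        current = self.active.get(group)
        if current is not None:
            self.cancelled.append(current)
        self.active[group] = run_id

    def finish(self, group, run_id):
        """Mark a run as finished if it is still the active one."""
        if self.active.get(group) == run_id:
            del self.active[group]


groups = ConcurrencyGroups()
groups.start("daily-review-main", "scheduled-1")
groups.start("daily-review-main", "manual-2")  # overlaps with the scheduled run
print(groups.active)
print(groups.cancelled)
```

In this model the manual run replaces the scheduled one instead of running alongside it, so only one run per group ever consumes credits at a time.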
Behaviors:
- group: runs with the same group name are mutually exclusive. Only one runs at a time.
- cancel-in-progress: true means that if a new run starts while an old one is still going, the old one is cancelled. This prevents stacking.
When you first deploy a scheduled workflow, or after making significant changes, use staged: true to preview what it would do without actually publishing outputs:
---
staged: true  # Runs the full workflow but doesn't publish outputs
safe-outputs:
  create-issue:
    title-prefix: "[daily-review] "
---
In staged mode:
- The run still consumes AI credits (keep max-ai-credits low)
- Review the staged output with gh aw logs <run-id> --staged-output
- Remove staged: true to go live
This is your safety net: validate cost, quality, and relevance before committing to daily spend.
As you add scheduled workflows (the background agents from Lesson 5), cost control moves from "nice to have" to essential infrastructure. Here's the progression:
| Stage | What to Do |
|---|---|
| First workflow | Set max-ai-credits, use cheapest model, enable staged: true |
| 2–5 workflows | Add max-daily-ai-credits, check gh aw logs weekly, tune budgets down |
| 5+ workflows | Set up OTLP export, create a cost dashboard, add spike alerts |
| Team-wide adoption | Establish org-level budget policies, review outcomes monthly, kill low-value workflows |
- Set max-ai-credits on every scheduled workflow (never rely on the 1000 default)
- Add max-daily-ai-credits as a safety net against retries and manual triggers
- Use the cheapest suitable model (gpt-4.1-mini or claude-haiku-4-5 for monitoring)
- Add skip-if conditions to avoid running when there's nothing to analyze
- Add concurrency config to prevent overlapping runs
- Deploy with staged: true first and review output before going live
- Check gh aw logs weekly for cost spikes and unnecessary runs
What happens when a workflow reaches its max-ai-credits limit?
max-ai-credits is a hard ceiling. When the budget is exhausted, the run stops immediately and no outputs, partial or otherwise, are published, which prevents a runaway analysis from producing incomplete or misleading results. It doesn't downgrade models, pause for approval, or charge a penalty; the workflow simply terminates to protect your budget.
What is the purpose of staged: true in a workflow?
It runs the full workflow without publishing outputs. Review the results with gh aw logs <run-id> --staged-output and remove staged: true once you're confident in the quality and cost.
How do you find out which workflow is costing the most?
gh aw logs shows the actual AIC consumed per run, which you can compare across workflows. For teams with many workflows, OTLP dashboards aggregate this automatically. The max-ai-credits value is only the budget cap, not actual spend, so a workflow might use far less than its limit. The billing page exists but doesn't break down by individual workflow.
You now know how to keep your agentic workflows on a budget and how to see exactly what they're doing. Cost controls and observability aren't optional add-ons; they're the foundation that makes all your other automation sustainable. The next lesson covers testing and validation: how to verify that your workflows produce correct outputs before trusting them in production.
Cost Management Reference — complete documentation on max-ai-credits, max-daily-ai-credits, model pricing, OTLP export configuration, and budget alerting.