In the last 90 days I sat down with 12 engineering teams who had rolled out Claude Code, Cursor, or both at company scale. Series A through public, four to two hundred engineers, US and EMEA. The brief was the same every time: "We are spending a lot. Is the spend matching the output?"
The short answer was no, twelve times out of twelve. The long answer is below.
Median wasted spend per team, per quarter: $14,200. Lowest was $3,800. Highest was $61,000 at a 90-person org running Claude Max + Cursor + a self-hosted MCP fleet with nothing wired to a budget alarm.
(Numbers below are composites. Specific tenants are anonymized. The patterns repeat.)
TL;DR
- 12 of 12 teams had no per-task token budget configured. They were one bad agent loop away from a four-figure overnight bill.
- 10 of 12 teams were running Opus on tasks Haiku would have finished in 12 seconds.
- 9 of 12 were paying for two coding assistants whose use cases overlapped 80%.
- 8 of 12 had MCP servers running on every developer's machine that nobody in the room could name.
- 7 of 12 had no pre-tool-use hooks. `terraform destroy` was one prompt away from production.
- 6 of 12 were re-uploading the same 40MB of repo context to every fresh session because prompt caching was not enabled.
- 5 of 12 had a `CLAUDE.md` that was last touched in October 2025. The repo had moved on. The agent had not.
Run the same math on your team in 30 seconds.
The seven patterns
1. No per-task token budget
Twelve teams. Zero budgets.
When a Claude Code session goes into a tight loop on a flaky test or a misconfigured MCP, it does not stop. It iterates. The bill arrives 14 days later, and at that point you are reading the receipt, not writing it.
The fix is one hook. Reject the tool call when cumulative tokens for the task exceed your threshold. Log the violation. Done.
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": ".*",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/budget-guard.sh 200000"
          }
        ]
      }
    ]
  }
}
```
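The `budget-guard.sh` itself is yours to write. Here is a minimal sketch, with two loud assumptions: it reads cumulative usage from a per-task state file that something upstream (a `PostToolUse` hook, say) keeps appending token counts to, and it signals rejection with its exit code. Real hooks also receive a JSON payload on stdin, so treat this as the shape, not a drop-in.

```shell
#!/usr/bin/env bash
# budget-guard.sh (sketch): block the tool call once cumulative tokens for
# the current task exceed the budget passed as the first argument.
# Assumption: a state file accumulates one token count per line.
budget_guard() {
  local budget="${1:-200000}"
  local state="${2:-/tmp/claude-task-tokens}"
  local used=0
  [ -f "$state" ] && used=$(awk '{ s += $1 } END { print s + 0 }' "$state")
  if [ "$used" -gt "$budget" ]; then
    echo "budget-guard: $used tokens used, budget is $budget; blocking" >&2
    return 2  # the non-zero exit is what rejects the call; log it too
  fi
  return 0
}
# as a script, end with: budget_guard "$@"
```

Exit code 2 as the blocking signal follows Claude Code's hook convention; verify against the version you run before relying on it.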
200K tokens per task is generous for most work. The teams that adopted this in the audit reported zero developer complaints and a 40 to 60 percent drop in tail-of-bill spend within two weeks.
2. Opus on Haiku-shaped work
Renaming a variable. Writing a regex. Drafting a release note. Searching a four-file Python module.
If the task fits in a paragraph and does not require reasoning across files, you do not need a frontier model. You need Haiku. The price difference is roughly 12x. The latency difference is roughly 4x. The quality difference, for these tasks, is invisible.
The audit pattern was always the same. One config, one model, every task. The fix is routing: cheap default, expensive when the task plan asks for it. Most teams cut model cost by 35 to 50 percent the first month they routed.
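Routing does not need to start sophisticated. A hypothetical wrapper, where the keyword triggers are placeholders for whatever task taxonomy your team already uses, and the `claude --model` line in the comment assumes Claude Code's CLI flag:

```shell
# route_model (sketch): cheap default, frontier model only when the task
# plan asks for cross-file reasoning. The keyword triggers are assumptions.
route_model() {
  case "$1" in
    *refactor*|*architecture*|*cross-file*|*migration*) echo "opus" ;;
    *) echo "haiku" ;;  # renames, regexes, release notes, small searches
  esac
}
# usage sketch: claude --model "$(route_model "$task")" -p "$task"
```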
3. Two assistants, one job
Six of the 12 teams paid for Claude Code, Cursor, and Copilot. Three paid for two of the three.
That is not pluralism. That is overlap. Every developer ends up with a favorite, and the org pays three times for the same outcome. The honest audit question is not which one is best. It is which one your senior engineers reach for at 11 PM on a Wednesday when something is broken. That is the one. Cancel the others, or sunset them by the end of the quarter.
4. MCP server sprawl
Eight teams were running MCP servers that nobody could justify. Filesystem servers with sudo. Browser automation servers nobody had configured. A Postgres MCP pointing at staging credentials checked into a `.mcp.json` in a public fork.
The fix is a quarterly MCP review. Three columns: server name, business case, owner. If a row has no owner, the server is gone by Friday. If the business case fits in fewer than six words, it stays. Otherwise, it is a candidate for removal.
5. No pre-tool-use hooks
A Claude Code agent without hooks is an intern with sudo. It will not be malicious. It will be confidently wrong about `rm -rf`, `git push --force`, or `terraform destroy`, and you will not know until your on-call wakes up.
The minimum viable hook set is four rules:
- Block `rm -rf` outside `/tmp` and the repo's gitignored paths
- Block `git push --force` to any branch matching `main|master|prod*|release*`
- Block `terraform destroy`, `kubectl delete namespace`, `aws s3 rb`
- Require human confirmation on anything touching `.github/workflows/`, `terraform/`, or migrations
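Under the simplifying assumption that the hook sees the proposed command as a single string (a real pre-tool-use hook reads a JSON payload on stdin, so adapt the extraction), the four rules collapse into one case statement:

```shell
# deny_destructive (sketch): 0 allows, 2 blocks, 3 asks for a human.
# Simplified on purpose: rule 1 allowlists only /tmp, and rule 2 blocks
# every force push rather than matching protected-branch names.
deny_destructive() {
  case "$1" in
    *"rm -rf /tmp"*) return 0 ;;                      # allowlisted scratch space
    *"rm -rf"*) return 2 ;;                           # rule 1
    *"git push --force"*) return 2 ;;                 # rule 2
    *"terraform destroy"*|*"kubectl delete namespace"*|*"aws s3 rb"*) return 2 ;;  # rule 3
    *".github/workflows/"*|*"terraform/"*|*migration*) return 3 ;;  # rule 4: confirm
  esac
  return 0
}
```

Code 3 for "ask a human" is this sketch's own convention, not a hook standard; map it to whatever your runner treats as a confirmation prompt.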
Every team whose audit turned up an incident had no hooks. Every team without an incident had at least these four. It is not a coincidence.
6. Prompt caching off
Anthropic's prompt caching cuts the cost of repeated context by up to 90 percent. It is one config flag. Six teams in the audit were not using it, and were paying full freight to re-upload the same CLAUDE.md, the same repo map, and the same skill definitions on every session.
For a team of 20 with 6 sessions a day each, that is roughly $1,800 a month in pure waste. For a team of 60, you are looking at $5,000+. The fix takes nine minutes.
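If you are on the raw Messages API rather than a managed client, the lever is a `cache_control` marker on the stable prefix of the request, so the big blocks get written to the cache once and read cheaply afterward. A sketch of the request-body shape; the model name and system text are placeholders:

```json
{
  "model": "claude-sonnet-4-5",
  "system": [
    {
      "type": "text",
      "text": "<CLAUDE.md + repo map + skill definitions go here>",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    { "role": "user", "content": "Fix the flaky test in auth_test.py" }
  ]
}
```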
7. Stale CLAUDE.md
The CLAUDE.md file is the agent's onboarding doc. When it goes stale, the agent makes up the rest. You can spot a stale CLAUDE.md by counting how often the agent suggests a file path that does not exist, or a script that has been renamed, or a deploy command that the team retired in February.
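That counting can be scripted. A rough heuristic sketch; the path regex and extension list are assumptions, so expect false positives on prose that merely looks like a path:

```shell
# stale_paths (sketch): extract path-looking tokens from CLAUDE.md and
# flag any that no longer exist in the working tree.
stale_paths() {
  local doc="${1:-CLAUDE.md}"
  grep -oE '[A-Za-z0-9_./-]+\.(py|ts|js|go|sh|json|ya?ml)' "$doc" | sort -u |
  while read -r p; do
    [ -e "$p" ] || echo "missing: $p"
  done
}
```

Run it from the repo root; a screenful of `missing:` lines is your twenty-minute review agenda.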
The fix is a quarterly review on the calendar. Same cadence as a runbook audit. Twenty minutes per repo. Nobody has done this in any of the 12 audits.
The honest comparison
Across the 12 teams, the median spend was $48K per quarter on AI coding tools. The median waste was $14.2K per quarter. That is 30 percent of the line item, every quarter, doing nothing.
I am not making the argument that you should spend less. I am making the argument that you should spend the same and ship 30 percent more, or spend 30 percent less and ship the same. The decision is yours. The leak is real.
What a one-week tune-up looks like
If you do nothing else this month, do these five things in order. Each is a single sitting.
- Day 1. Add the four-rule hook set to every repo's `settings.json`. Block destructive defaults.
- Day 2. Wire a per-task budget hook at 200K tokens. Log violations.
- Day 3. Enable prompt caching. Verify on the next session.
- Day 4. Inventory MCP servers. Remove anything without an owner.
- Day 5. Refresh every `CLAUDE.md` you have. Twenty minutes per repo.
Conservative pre-and-post comparison for the teams that completed this sequence: 22 to 38 percent lower spend, zero loss in shipped output, two near-miss incidents avoided in the following 30 days.
What I am not telling you
I am not telling you to throw out your AI coding stack. I am telling you to operate it like a piece of infrastructure. Budgets. Hooks. Caching. Reviews. The teams that treat Claude Code like a contractor with sudo are the teams whose CFOs ask hard questions in October.
If you want a second pair of eyes on your setup, I do these audits. Two-week engagement, fixed fee, the deliverable is the spreadsheet plus the hooks. You keep the savings.
Receipts
- 12 audits across 90 days, March through May 2026.
- Median team size: 22 engineers.
- Median quarterly spend on AI coding tools: $48,000.
- Median quarterly waste: $14,200.
- Most common single fix: budget hook. Saves $3,000 to $9,000 per month on the median team.
- Least adopted single fix: quarterly `CLAUDE.md` review. Zero teams had it.
The leak is not in the tool. The leak is in the operations around the tool. Fix the operations.