In the last 90 days I sat down with 12 engineering teams who had rolled out Claude Code, Cursor, or both at company scale. Series A through public, four to two hundred engineers, US and EMEA. The brief was the same every time: "We are spending a lot. Is the spend matching the output?"
The short answer was no, twelve times out of twelve. The long answer is below.
Median wasted spend per team, per quarter: $14,200. Lowest was $3,800. Highest was $61,000 at a 90-person org running Claude Max + Cursor + a self-hosted MCP fleet with nothing wired to a budget alarm.
(Numbers below are composites. Specific tenants are anonymized. The patterns repeat.)
TL;DR
- 12 of 12 teams had no per-task token budget configured. They were one bad agent loop away from a four-figure overnight bill.
- 10 of 12 teams were running Opus on tasks Haiku would have finished in 12 seconds.
- 9 of 12 were paying for two coding assistants whose use cases overlapped 80%.
- 8 of 12 had MCP servers running on every developer's machine that nobody in the room could name.
- 7 of 12 had no pre-tool-use hooks. `terraform destroy` was one prompt away from production.
- 6 of 12 were re-uploading the same 40MB of repo context to every fresh session because prompt caching was not enabled.
- 5 of 12 had a `CLAUDE.md` that was last touched in October 2025. The repo had moved on. The agent had not.
Run the same math on your team in 30 seconds.
The seven patterns
1. No per-task token budget
Twelve teams. Zero budgets.
When a Claude Code session goes into a tight loop on a flaky test or a misconfigured MCP, it does not stop. It iterates. The bill arrives 14 days later, and at that point you are reading the receipt, not writing it.
The fix is one hook. Reject the tool call when cumulative tokens for the task exceed your threshold. Log the violation. Done.
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": ".*",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/budget-guard.sh 200000"
          }
        ]
      }
    ]
  }
}
```
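The `budget-guard.sh` itself is yours to write. Here is a minimal sketch, with two loud assumptions: it reads cumulative usage from a per-task state file that something upstream (a `PostToolUse` hook, say) keeps appending token counts to, and it signals rejection with its exit code. Real hooks also receive a JSON payload on stdin, so treat this as the shape, not a drop-in.

```shell
#!/usr/bin/env bash
# budget-guard.sh (sketch): block the tool call once cumulative tokens for
# the current task exceed the budget passed as the first argument.
# Assumption: a state file accumulates one token count per line.
budget_guard() {
  local budget="${1:-200000}"
  local state="${2:-/tmp/claude-task-tokens}"
  local used=0
  [ -f "$state" ] && used=$(awk '{ s += $1 } END { print s + 0 }' "$state")
  if [ "$used" -gt "$budget" ]; then
    echo "budget-guard: $used tokens used, budget is $budget; blocking" >&2
    return 2  # the non-zero exit is what rejects the call; log it too
  fi
  return 0
}
# as a script, end with: budget_guard "$@"
```

Exit code 2 as the blocking signal follows Claude Code's hook convention; verify against the version you run before relying on it.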
200K tokens per task is generous for most work. The teams that adopted this in the audit reported zero developer complaints and a 40 to 60 percent drop in tail-of-bill spend within two weeks.
2. Opus on Haiku-shaped work
Renaming a variable. Writing a regex. Drafting a release note. Searching a four-file Python module.
If the task fits in a paragraph and does not require reasoning across files, you do not need a frontier model. You need Haiku. The price difference is roughly 12x. The latency difference is roughly 4x. The quality difference, for these tasks, is invisible.
The audit pattern was always the same. One config, one model, every task. The fix is routing: cheap default, expensive when the task plan asks for it. Most teams cut model cost by 35 to 50 percent the first month they routed.
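Routing does not need to start sophisticated. A hypothetical wrapper, where the keyword triggers are placeholders for whatever task taxonomy your team already uses, and the `claude --model` line in the comment assumes Claude Code's CLI flag:

```shell
# route_model (sketch): cheap default, frontier model only when the task
# plan asks for cross-file reasoning. The keyword triggers are assumptions.
route_model() {
  case "$1" in
    *refactor*|*architecture*|*cross-file*|*migration*) echo "opus" ;;
    *) echo "haiku" ;;  # renames, regexes, release notes, small searches
  esac
}
# usage sketch: claude --model "$(route_model "$task")" -p "$task"
```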
3. Two assistants, one job
Six of the 12 teams paid for Claude Code, Cursor, and Copilot. Three paid for two of the three.
That is not pluralism. That is overlap. Every developer ends up with a favorite, and the org pays three times for the same outcome. The honest audit question is not which one is best. It is which one your senior engineers reach for at 11 PM on a Wednesday when something is broken. That is the one. Cancel the others, or sunset them by the end of the quarter.
4. MCP server sprawl
Eight teams were running MCP servers that nobody could justify. Filesystem servers with sudo. Browser automation servers nobody had configured. A Postgres MCP pointing at staging credentials checked into a `.mcp.json` in a public fork.
The fix is a quarterly MCP review. Three columns: server name, business case, owner. If a row has no owner, the server is gone by Friday. If the business case fits in fewer than six words, it stays. Otherwise, it is a candidate for removal.
5. No pre-tool-use hooks
A Claude Code agent without hooks is an intern with sudo. It will not be malicious. It will be confidently wrong about `rm -rf`, `git push --force`, or `terraform destroy`, and you will not know until your on-call wakes up.
The minimum viable hook set is four rules:
- Block `rm -rf` outside `/tmp` and the repo's gitignored paths
- Block `git push --force` to any branch matching `main|master|prod*|release*`
- Block `terraform destroy`, `kubectl delete namespace`, `aws s3 rb`
- Require human confirmation on anything touching `.github/workflows/`, `terraform/`, or migrations
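Under the simplifying assumption that the hook sees the proposed command as a single string (a real pre-tool-use hook reads a JSON payload on stdin, so adapt the extraction), the four rules collapse into one case statement:

```shell
# deny_destructive (sketch): 0 allows, 2 blocks, 3 asks for a human.
# Simplified on purpose: rule 1 allowlists only /tmp, and rule 2 blocks
# every force push rather than matching protected-branch names.
deny_destructive() {
  case "$1" in
    *"rm -rf /tmp"*) return 0 ;;                      # allowlisted scratch space
    *"rm -rf"*) return 2 ;;                           # rule 1
    *"git push --force"*) return 2 ;;                 # rule 2
    *"terraform destroy"*|*"kubectl delete namespace"*|*"aws s3 rb"*) return 2 ;;  # rule 3
    *".github/workflows/"*|*"terraform/"*|*migration*) return 3 ;;  # rule 4: confirm
  esac
  return 0
}
```

Code 3 for "ask a human" is this sketch's own convention, not a hook standard; map it to whatever your runner treats as a confirmation prompt.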
Every team whose audit turned up an incident had no hooks. Every team without an incident had at least these four. It is not a coincidence.
6. Prompt caching off
Anthropic's prompt caching cuts the cost of repeated context by up to 90 percent. It is one config flag. Six teams in the audit were not using it, and were paying full freight to re-upload the same CLAUDE.md, the same repo map, and the same skill definitions on every session.
For a team of 20 with 6 sessions a day each, that is roughly $1,800 a month in pure waste. For a team of 60, you are looking at $5,000+. The fix takes nine minutes.
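If you are on the raw Messages API rather than a managed client, the lever is a `cache_control` marker on the stable prefix of the request, so the big blocks get written to the cache once and read cheaply afterward. A sketch of the request-body shape; the model name and system text are placeholders:

```json
{
  "model": "claude-sonnet-4-5",
  "system": [
    {
      "type": "text",
      "text": "<CLAUDE.md + repo map + skill definitions go here>",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    { "role": "user", "content": "Fix the flaky test in auth_test.py" }
  ]
}
```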
7. Stale CLAUDE.md
The CLAUDE.md file is the agent's onboarding doc. When it goes stale, the agent makes up the rest. You can spot a stale CLAUDE.md by counting how often the agent suggests a file path that does not exist, or a script that has been renamed, or a deploy command that the team retired in February.
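That counting can be scripted. A rough heuristic sketch; the path regex and extension list are assumptions, so expect false positives on prose that merely looks like a path:

```shell
# stale_paths (sketch): extract path-looking tokens from CLAUDE.md and
# flag any that no longer exist in the working tree.
stale_paths() {
  local doc="${1:-CLAUDE.md}"
  grep -oE '[A-Za-z0-9_./-]+\.(py|ts|js|go|sh|json|ya?ml)' "$doc" | sort -u |
  while read -r p; do
    [ -e "$p" ] || echo "missing: $p"
  done
}
```

Run it from the repo root; a screenful of `missing:` lines is your twenty-minute review agenda.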
The fix is a quarterly review on the calendar. Same cadence as a runbook audit. Twenty minutes per repo. Nobody has done this in any of the 12 audits.
The honest comparison
Across the 12 teams, the median spend was $48K per quarter on AI coding tools. The median waste was $14.2K per quarter. That is 30 percent of the line item, every quarter, doing nothing.
I am not making the argument that you should spend less. I am making the argument that you should spend the same and ship 30 percent more, or spend 30 percent less and ship the same. The decision is yours. The leak is real.
What a one-week tune-up looks like
If you do nothing else this month, do these five things in order. Each is a single sitting.
- Day 1. Add the four-rule hook set to every repo's `settings.json`. Block destructive defaults.
- Day 2. Wire a per-task budget hook at 200K tokens. Log violations.
- Day 3. Enable prompt caching. Verify on the next session.
- Day 4. Inventory MCP servers. Remove anything without an owner.
- Day 5. Refresh every `CLAUDE.md` you have. Twenty minutes per repo.
Conservative pre-and-post comparison for the teams that completed this sequence: 22 to 38 percent lower spend, zero loss in shipped output, two near-miss incidents avoided in the following 30 days.
What I am not telling you
I am not telling you to throw out your AI coding stack. I am telling you to operate it like a piece of infrastructure. Budgets. Hooks. Caching. Reviews. The teams that treat Claude Code like a contractor with sudo are the teams whose CFOs ask hard questions in October.
If you want a second pair of eyes on your setup, I do these audits. Two-week engagement, fixed fee, the deliverable is the spreadsheet plus the hooks. You keep the savings.
Receipts
- 12 audits across 90 days, March through May 2026.
- Median team size: 22 engineers.
- Median quarterly spend on AI coding tools: $48,000.
- Median quarterly waste: $14,200.
- Most common single fix: budget hook. Saves $3,000 to $9,000 per month on the median team.
- Least adopted single fix: quarterly `CLAUDE.md` review. Zero teams had it.
The leak is not in the tool. The leak is in the operations around the tool. Fix the operations.