A Postgres database that performs well in development and falls over in production is almost always missing an index. The same five-mistake list shows up in every audit. They are easy to find, easy to fix, and easy to ignore.
This is the field guide. The tool that does the analysis automatically is at /tools/pg-index-recommender.
The five missing-index patterns
Pattern 1: Seq Scan on a large table with a Filter
The most common. The EXPLAIN output looks like:
Seq Scan on orders (cost=0.00..18334.00 rows=124 width=72)
Filter: ((status = 'pending') AND (created_at > '2026-04-01'))
Rows Removed by Filter: 1248321
What this tells you: Postgres scanned 1.2 million rows to find 124. There is no index on (status, created_at).
The fix:
CREATE INDEX CONCURRENTLY idx_orders_status_created
ON orders (status, created_at);
Column order matters. If both columns are always filtered with equality, put the higher-cardinality column first. If one predicate is an equality and the other is a range, put the equality column first (Postgres can use the leading column for equality and the second column for the range, but not the other way around).
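To confirm the new index is actually picked up, re-run the original query under EXPLAIN ANALYZE (a sketch, using the table and columns from the plan above):

```sql
-- Equality on status uses the leading index column; the created_at
-- range then scans within that slice of the index.
EXPLAIN ANALYZE
SELECT *
FROM orders
WHERE status = 'pending'
  AND created_at > '2026-04-01';
-- A plan line such as
--   Index Scan using idx_orders_status_created on orders
-- (or a Bitmap Index Scan on the same index) confirms it is in use.
```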
Pattern 2: Sort spilled to disk
Sort (cost=...)
Sort Key: created_at DESC
Sort Method: external merge Disk: 32768kB
What this tells you: the result set is too large to sort in memory (work_mem). Postgres wrote 32 MB to disk and sorted it there. Disk is slow.
Two fixes:
- Add an index ordered the same way as the ORDER BY. Postgres can stream rows out of the index without sorting.
- Increase work_mem if many queries hit this pattern. Be careful: work_mem is per-operation, per-connection, and can multiply.
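Both fixes can be sketched concretely (assuming the query sorts orders by created_at DESC; the index name is illustrative):

```sql
-- Fix 1: an index in the same order as the ORDER BY lets Postgres
-- stream rows straight out of the index, with no Sort node at all.
CREATE INDEX CONCURRENTLY idx_orders_created_desc
  ON orders (created_at DESC);

-- Fix 2: raise work_mem for one session (or one query) rather than
-- globally, which limits the per-operation multiplication risk.
SET work_mem = '64MB';
```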
Pattern 3: Hash Join with a high cost
Hash Join (cost=23.40..534210.00 rows=10000 width=72)
Hash Cond: (a.user_id = b.id)
What this tells you: Postgres is joining via hash, which builds a hash table of the inner side. If b.id is a primary key (it usually is) and a.user_id has no index, this is going to be expensive at scale.
The fix:
CREATE INDEX CONCURRENTLY idx_a_user_id
ON a (user_id);
Foreign key columns should almost always have indexes. Postgres does not create them automatically (unlike some other databases). This is the single most common audit finding.
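Since Postgres does not index foreign keys for you, the catalogs can list the gaps. A sketch that handles single-column foreign keys and checks only whether the column leads some index:

```sql
-- Foreign-key columns with no index where they are the leading column.
SELECT c.conrelid::regclass AS table_name,
       a.attname            AS fk_column
FROM pg_constraint c
JOIN pg_attribute a
  ON a.attrelid = c.conrelid
 AND a.attnum = ANY (c.conkey)
WHERE c.contype = 'f'
  AND NOT EXISTS (
    SELECT 1
    FROM pg_index i
    WHERE i.indrelid = c.conrelid
      AND i.indkey[0] = a.attnum   -- int2vector subscripts are 0-based
  );
```

Multi-column foreign keys need a more careful comparison of conkey against the full indkey, but this catches the common case.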
Pattern 4: Index Scan with a low rowcount but high latency
Index Scan using idx_users_email on users (cost=0.42..8.44 rows=1 width=72)
(actual time=12.041..12.041 rows=1 loops=1)
The plan is using an index. The rowcount is 1. Why is it taking 12 ms?
Most likely the index is bloated, or heap dead-tuple churn means the index scan must visit and discard many dead row versions before it finds the one live match. The fix:
VACUUM ANALYZE users;
REINDEX INDEX CONCURRENTLY idx_users_email;
If the index is fine, the table itself might be on slow storage. Migrate to gp3 SSD or local NVMe.
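Before reindexing, it is worth checking whether dead-tuple churn is actually the cause and whether autovacuum has been keeping up:

```sql
-- Live vs. dead tuples and the last autovacuum/autoanalyze runs.
SELECT relname, n_live_tup, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'users';
```

A high n_dead_tup relative to n_live_tup, or a stale last_autovacuum, points at the vacuum/reindex fix above rather than storage.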
Pattern 5: An index that exists but is not being used
The bug-trap. You added an index. EXPLAIN does not use it.
Common causes:
- The index column is wrapped in a function in the query. WHERE lower(email) = $1 will not use idx_users_email unless the index is on lower(email).
- The column types do not match. WHERE user_id = '123' (string) will not use the index if user_id is an integer column.
- Statistics are stale. Run ANALYZE table_name and retry.
- The table is too small. Postgres knows sequential scans beat index scans for small tables. Below ~5,000 rows, expect seq scans.
The 5-minute audit
- Enable slow query logging:
log_min_duration_statement = 500
This logs every query taking over 500 ms.
- Watch the log for 24 hours. Pick the top 10 slowest queries.
- For each one, run EXPLAIN ANALYZE and feed the output into the recommender.
- Apply the suggestions one at a time. Always CREATE INDEX CONCURRENTLY. Always re-run EXPLAIN ANALYZE afterward to confirm the planner is using the new index.
- Re-check the slow query log a week later. The top 10 will have changed.
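If the pg_stat_statements extension is installed (an assumption; it is not enabled by default), the 24-hour log watch can be shortcut by querying it directly:

```sql
-- Top 10 queries by total execution time.
-- Column names are per Postgres 13+ (mean_exec_time / total_exec_time).
SELECT query, calls,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       round(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```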
The cost of indexes
Indexes are not free. Each one:
- Takes disk space (often 20 to 40 percent of the table size).
- Slows down every INSERT, UPDATE, and DELETE on the indexed columns.
- Has to be maintained during VACUUM.
The rule I run: index columns that appear in WHERE, JOIN, or ORDER BY clauses in queries that run more than 100 times a day. Do not index columns that only appear in ad-hoc analytical queries.
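The flip side of that rule is indexes nobody uses. Postgres tracks scans per index, so the never-used ones are easy to list (counters accumulate from the last statistics reset, so check on a database that has been running a while):

```sql
-- Indexes with zero scans since the last stats reset, largest first:
-- candidates for removal.
SELECT schemaname, relname AS table_name, indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```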
The cost of NOT indexing
A query that does a sequential scan on a 10-million-row table takes roughly 1 second on modern SSD storage. If that query runs 100 times a minute (a common API endpoint pattern), you are spending 100 CPU-seconds per minute on a query that should take 5 ms. That is more than a full CPU core, permanently, just for one missing index.
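The arithmetic, made explicit (the ~1 s scan, 100 runs per minute, and 5 ms target are the assumptions above):

```sql
SELECT 100 * 1              AS cpu_seconds_per_minute,  -- 100 runs/min × ~1 s each
       round(100 / 60.0, 2) AS cpu_cores_consumed,      -- 1.67: more than one full core
       1000 / 5             AS speedup_factor;          -- 1 s vs. 5 ms → 200×
```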
Receipts
- 9 Postgres audits in the last 90 days.
- Median number of high-priority missing indexes found per database: 4.
- Largest single saving: a 6-second query reduced to 8 ms after one index. Endpoint that called it served 2 million requests per day.
- Most common cause of a "the index exists but is not being used" bug: function wrapping in the query.