A Postgres database that performs well in development and falls over in production is almost always missing an index. The same five-mistake list shows up in every audit. They are easy to find, easy to fix, and easy to ignore.
This is the field guide. The tool that does the analysis automatically is at /tools/pg-index-recommender.
The five missing-index patterns
Pattern 1: Seq Scan on a large table with a Filter
The most common. The EXPLAIN output looks like:
Seq Scan on orders (cost=0.00..18334.00 rows=124 width=72)
Filter: ((status = 'pending') AND (created_at > '2026-04-01'))
Rows Removed by Filter: 1248321
What this tells you: Postgres scanned 1.2 million rows to find 124. There is no index on (status, created_at).
The fix:
CREATE INDEX CONCURRENTLY idx_orders_status_created
ON orders (status, created_at);
Column order matters. If both columns are always filtered with equality, put the higher-cardinality column first. If one predicate is an equality and the other is a range, put the equality column first (Postgres can use the leading column for equality and the second column for the range, but not the other way around).
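To confirm the new index is actually picked up, re-run the original query under EXPLAIN ANALYZE (a sketch, using the table and columns from the plan above):

```sql
-- Equality on status uses the leading index column; the created_at
-- range then scans within that slice of the index.
EXPLAIN ANALYZE
SELECT *
FROM orders
WHERE status = 'pending'
  AND created_at > '2026-04-01';
-- A plan line such as
--   Index Scan using idx_orders_status_created on orders
-- (or a Bitmap Index Scan on the same index) confirms it is in use.
```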
Pattern 2: Sort spilled to disk
Sort (cost=...)
Sort Key: created_at DESC
Sort Method: external merge Disk: 32768kB
What this tells you: the result set is too large to sort in memory (work_mem). Postgres wrote 32 MB to disk and sorted it there. Disk is slow.
Two fixes:
- Add an index ordered the same way as the ORDER BY. Postgres can stream rows out of the index without sorting.
- Increase work_mem if many queries hit this pattern. Be careful: work_mem is per-operation, per-connection, and can multiply.
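Both fixes can be sketched concretely (assuming the query sorts orders by created_at DESC; the index name is illustrative):

```sql
-- Fix 1: an index in the same order as the ORDER BY lets Postgres
-- stream rows straight out of the index, with no Sort node at all.
CREATE INDEX CONCURRENTLY idx_orders_created_desc
  ON orders (created_at DESC);

-- Fix 2: raise work_mem for one session (or one query) rather than
-- globally, which limits the per-operation multiplication risk.
SET work_mem = '64MB';
```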
Pattern 3: Hash Join with a high cost
Hash Join (cost=23.40..534210.00 rows=10000 width=72)
Hash Cond: (a.user_id = b.id)
What this tells you: Postgres is joining via hash, which builds a hash table of the inner side. If b.id is a primary key (it usually is) and a.user_id has no index, this is going to be expensive at scale.
The fix:
CREATE INDEX CONCURRENTLY idx_a_user_id
ON a (user_id);
Foreign key columns should almost always have indexes. Postgres does not create them automatically (unlike some other databases). This is the single most common audit finding.
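Since Postgres does not index foreign keys for you, the catalogs can list the gaps. A sketch that handles single-column foreign keys and checks only whether the column leads some index:

```sql
-- Foreign-key columns with no index where they are the leading column.
SELECT c.conrelid::regclass AS table_name,
       a.attname            AS fk_column
FROM pg_constraint c
JOIN pg_attribute a
  ON a.attrelid = c.conrelid
 AND a.attnum = ANY (c.conkey)
WHERE c.contype = 'f'
  AND NOT EXISTS (
    SELECT 1
    FROM pg_index i
    WHERE i.indrelid = c.conrelid
      AND i.indkey[0] = a.attnum   -- int2vector subscripts are 0-based
  );
```

Multi-column foreign keys need a more careful comparison of conkey against the full indkey, but this catches the common case.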
Pattern 4: Index Scan with a low rowcount but high latency
Index Scan using idx_users_email on users (cost=0.42..8.44 rows=1 width=72)
(actual time=12.041..12.041 rows=1 loops=1)
The plan is using an index. The rowcount is 1. Why is it taking 12 ms?
Most likely the index is bloated, or heap dead-tuple churn means the index scan must visit and discard many dead row versions before it finds the one live match. The fix:
VACUUM ANALYZE users;
REINDEX INDEX CONCURRENTLY idx_users_email;
If the index is fine, the table itself might be on slow storage. Migrate to gp3 SSD or local NVMe.
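Before reindexing, it is worth checking whether dead-tuple churn is actually the cause and whether autovacuum has been keeping up:

```sql
-- Live vs. dead tuples and the last autovacuum/autoanalyze runs.
SELECT relname, n_live_tup, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'users';
```

A high n_dead_tup relative to n_live_tup, or a stale last_autovacuum, points at the vacuum/reindex fix above rather than storage.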
Pattern 5: An index that exists but is not being used
The bug-trap. You added an index. EXPLAIN does not use it.
Common causes:
- The index column is wrapped in a function in the query. WHERE lower(email) = $1 will not use idx_users_email unless the index is on lower(email).
- The column types do not match. WHERE user_id = '123' (string) will not use the index if user_id is an integer column.
- Statistics are stale. Run ANALYZE table_name and retry.
- The table is too small. Postgres knows sequential scans beat index scans for small tables. Below ~5,000 rows, expect seq scans.
The 5-minute audit
- Enable slow query logging:
log_min_duration_statement = 500
This logs every query taking over 500 ms.
- Watch the log for 24 hours. Pick the top 10 slowest queries.
- For each one, run EXPLAIN ANALYZE and feed the output into the recommender.
- Apply the suggestions one at a time. Always CREATE INDEX CONCURRENTLY. Always re-run EXPLAIN ANALYZE afterward to confirm the planner is using the new index.
- Re-check the slow query log a week later. The top 10 will have changed.
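If the pg_stat_statements extension is installed (an assumption; it is not enabled by default), the 24-hour log watch can be shortcut by querying it directly:

```sql
-- Top 10 queries by total execution time.
-- Column names are per Postgres 13+ (mean_exec_time / total_exec_time).
SELECT query, calls,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       round(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```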
The cost of indexes
Indexes are not free. Each one:
- Takes disk space (often 20 to 40 percent of the table size).
- Slows down every INSERT, UPDATE, and DELETE on the indexed columns.
- Has to be maintained during VACUUM.
The rule I run: index columns that appear in WHERE, JOIN, or ORDER BY clauses in queries that run more than 100 times a day. Do not index columns that only appear in ad-hoc analytical queries.
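The flip side of that rule is indexes nobody uses. Postgres tracks scans per index, so the never-used ones are easy to list (counters accumulate from the last statistics reset, so check on a database that has been running a while):

```sql
-- Indexes with zero scans since the last stats reset, largest first:
-- candidates for removal.
SELECT schemaname, relname AS table_name, indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```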
The cost of NOT indexing
A query that does a sequential scan on a 10-million-row table takes roughly 1 second on modern SSD storage. If that query runs 100 times a minute (a common API endpoint pattern), you are spending 100 CPU-seconds per minute on a query that should take 5 ms. That is more than a full CPU core, permanently, just for one missing index.
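The arithmetic, made explicit (the ~1 s scan, 100 runs per minute, and 5 ms target are the assumptions above):

```sql
SELECT 100 * 1              AS cpu_seconds_per_minute,  -- 100 runs/min × ~1 s each
       round(100 / 60.0, 2) AS cpu_cores_consumed,      -- 1.67: more than one full core
       1000 / 5             AS speedup_factor;          -- 1 s vs. 5 ms → 200×
```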
Receipts
- 9 Postgres audits in the last 90 days.
- Median number of high-priority missing indexes found per database: 4.
- Largest single saving: a 6-second query reduced to 8 ms after one index. Endpoint that called it served 2 million requests per day.
- Most common cause of a "the index exists but is not being used" bug: function wrapping in the query.