PostgreSQL Performance: A Practical Guide to Indexes, Query Plans, and Pooling
Cover: gradient generated for Devgains
Postgres performance work almost always comes down to three levers, in this order:
read what the planner is actually doing with EXPLAIN, make sure the right indexes
exist so it can avoid scanning whole tables, and stop your application from opening more
connections than the database can serve. Get those three right and most "the database
is slow" incidents disappear — usually without a bigger instance. This guide is the hub
for the Devgains database cluster: it gives you the mental model, then points to the
deep dives for each piece.
The reason performance tuning feels like guesswork is that people change things they
cannot measure. They add an index and hope, bump work_mem because a blog said so, or
scale the instance up because CPU looks high. PostgreSQL is far more knowable than that.
It will tell you exactly how it plans to run a query, why a query is slow, and what it
expects each step to cost — if you ask it.
Quick answer: how do you improve Postgres performance?
To improve Postgres performance, work the slow query, not the server:
- Measure first. Run
EXPLAIN (ANALYZE, BUFFERS)on the slow query and read the plan bottom-up. Find the node that eats the time. - Index the predicate. If you see a
Seq Scanfiltering most rows away, add an index on the column(s) in theWHERE/JOIN/ORDER BY. - Cut round trips. Collapse N+1 query loops into a single set-based query.
- Pool connections. Put PgBouncer in front of the database so hundreds of clients share a small number of backends.
- Re-measure. Confirm the plan changed and the latency dropped. Never trust the fix until the plan proves it.
Everything below is an expansion of those five steps.
Why it matters
A slow database is rarely a slow database — it is usually an application asking it to do unnecessary work. Postgres on modest hardware comfortably serves tens of thousands of well-planned queries per second. The same instance falls over at a few hundred badly-planned ones, because a single missing index turns a millisecond index lookup into a multi-hundred-millisecond sequential scan, and a connection storm turns spare CPU into lock contention and context-switching.
That gap — between what the hardware can do and what a careless query pattern lets it do — is where almost all the cost and almost all the outages live. Tuning it is some of the highest-leverage work in a backend, and it is cheap: a one-line index or a pooler config often beats a month of vertical scaling.
The mental model: how Postgres runs a query
Every query goes through the planner, which estimates the cheapest way to produce the rows you asked for using the statistics it keeps about each table. It chooses between access methods — and naming them is half the battle:
| Plan node | What it does | When it's good | When it's a smell |
|---|---|---|---|
| Seq Scan | Reads the whole table | Small tables, or you need most rows | Large table + selective WHERE → missing index |
| Index Scan | Walks an index, then fetches matching rows from the heap | Selective predicate returning few rows | Returning most of the table (heap fetches dominate) |
| Index Only Scan | Answers entirely from the index, skips the heap | All needed columns are in the index | — (this is the goal; see below) |
| Bitmap Heap Scan | Builds a bitmap of matching pages, then reads them in order | Medium selectivity, multiple indexes | Often fine; tune if it reads too many pages |
| Nested Loop / Hash / Merge Join | Three join strategies | Planner picks by row estimates | Wrong choice usually means stale statistics |
The single most important habit is reading the plan. EXPLAIN shows the planner's
estimate; EXPLAIN ANALYZE actually runs the query and shows real timings and real row
counts. The gap between estimated and actual rows is your best clue:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, email
FROM users
WHERE created_at >= now() - interval '7 days';Seq Scan on users (cost=0.00..18334.00 rows=812 width=36)
(actual time=0.014..142.881 rows=796 loops=1)
Filter: (created_at >= (now() - '7 days'::interval))
Rows Removed by Filter: 999204
Buffers: shared hit=8334
Planning Time: 0.121 ms
Execution Time: 142.948 msRead it bottom-up. The Seq Scan walked the whole table — Buffers: shared hit=8334
pages — and threw away 999,204 rows to keep 796. That Rows Removed by Filter line is the
tell: the database did a million rows of work to return eight hundred. This query wants an
index.
Step by step: from Seq Scan to Index Scan
Add an index on the column in the predicate:
CREATE INDEX CONCURRENTLY idx_users_created_at ON users (created_at);CONCURRENTLY builds the index without taking a write lock on the table, so it is safe to
run on a live system (it is slower and can't run inside a transaction block). Re-run the
same EXPLAIN (ANALYZE, BUFFERS) and the plan flips:
Index Scan using idx_users_created_at on users
(cost=0.42..32.86 rows=812 width=36)
(actual time=0.028..0.391 rows=796 rows=796 loops=1)
Index Cond: (created_at >= (now() - '7 days'::interval))
Buffers: shared hit=12
Planning Time: 0.144 ms
Execution Time: 0.447 msFrom 8,334 buffers to 12, and from 143 ms to under half a millisecond. That is the entire game in miniature: the index let Postgres jump straight to the matching rows instead of reading the whole table. The mental model behind why this works — and why every index you add taxes your writes — is worth internalizing before you add indexes reflexively.
If you also only ever select columns that live in the index, Postgres can skip the table heap entirely and return an index-only scan, the fastest read it offers. A covering index makes that explicit:
-- Now SELECT id, email ... can be answered from the index alone.
CREATE INDEX idx_users_created_at_covering
ON users (created_at) INCLUDE (id, email);Index the columns you filter and join on, in the order you use them. A composite index on
(tenant_id, created_at) serves WHERE tenant_id = ? AND created_at > ? and
WHERE tenant_id = ?, but not WHERE created_at > ? alone — the leftmost column has
to be in the predicate. This left-to-right rule explains most "why is my index ignored?"
surprises.
The query pattern that no index can save: N+1
Indexes fix individual slow queries. They do nothing for the most common application-level performance bug, where the query is fast but you run it hundreds of times. An ORM that lazy-loads a relation inside a loop fires one query per row — the N+1 query problem:
SELECT * FROM posts WHERE author_id = 1; -- the "1"
SELECT * FROM comments WHERE post_id = 10; -- + N
SELECT * FROM comments WHERE post_id = 11;
SELECT * FROM comments WHERE post_id = 12;
... 200 moreEach query is sub-millisecond and perfectly indexed, so nothing looks wrong — yet the page
takes a second because of round-trip latency multiplied by N. The fix is to ask for
everything in one set-based query (a JOIN or WHERE post_id = ANY($1)), not to tune any
single statement. No index makes 200 round trips fast; making it one round trip does.
The third lever: connections don't scale linearly
Suppose every query is indexed and N+1-free, and the database still falls over under load. The culprit is almost always connections. A PostgreSQL connection is backed by a full OS process with its own memory; a few dozen are cheap, but hundreds of concurrent connections spend more time contending for locks and CPU than doing work. Throughput goes down as connections climb past the core count.
The fix is a pooler — typically PgBouncer — that lets thousands of application clients multiplex onto a small, fixed set of real backends:
# pgbouncer.ini — transaction pooling: a backend is handed back to the
# pool at the end of each transaction, so a handful serve many clients.
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb
[pgbouncer]
pool_mode = transaction
max_client_conn = 5000 # clients that may connect to PgBouncer
default_pool_size = 20 # real Postgres backends per databaseFive thousand clients, twenty backends. A good starting point for default_pool_size is
roughly ((core_count * 2) + effective_spindle_count) — small, deliberately. The instinct
to raise it under load is exactly backwards; a smaller pool of busy connections beats a
large pool of contending ones.
Best practices
- Always
EXPLAIN (ANALYZE, BUFFERS)before and after. A fix you cannot see in the plan is a guess.BUFFERSexposes how many pages were actually touched. - Keep statistics fresh. Let autovacuum run, and
ANALYZEafter big data changes. Most "the planner picked a dumb plan" problems are stale row estimates. - Index for the workload, not the schema. Add indexes for the queries you actually
run; drop ones that aren't used (
pg_stat_user_indexesshows usage). Every index slows writes and costs storage. - Prefer composite and covering indexes over many single-column ones when queries filter on several columns together.
- Pool connections from day one. It is far easier to add PgBouncer early than to diagnose a connection storm at 2 a.m.
- Watch
pg_stat_statementsto find the queries that cost the most aggregate time — optimize the top of that list, not whatever felt slow once.
Common mistakes
- Tuning the server before the query. Bumping
shared_bufferswon't save aSeq Scanthat an index would eliminate. Fix the plan first. - Adding indexes blindly. Each one taxes every
INSERT/UPDATE/DELETE. An unused index is pure overhead. - Wrapping indexed columns in functions.
WHERE lower(email) = $1can't use a plain index onemail— you need an expression index onlower(email). - Raising the connection limit under load. More connections past the core count make contention worse, not better. Pool instead.
- Trusting
EXPLAINwithoutANALYZE. Estimates lie when statistics are stale; onlyANALYZEshows what really happened. SELECT *everywhere, which defeats index-only scans and ships columns you don't use.
Takeaways
- Three levers, in order: read the plan with
EXPLAIN (ANALYZE, BUFFERS), add the right index, pool connections. Most slowness lives in one of the three. Rows Removed by Filteron a big table = missing index. That line is your fastest diagnostic.- Indexes fix one slow query; set-based queries fix N+1. They solve different problems — don't reach for an index when the bug is too many round trips.
- Connections don't scale linearly. A small, busy pool beats a large, contending one.
- Measure before and after, every time. A fix that doesn't change the plan didn't fix anything.
Go deeper in the Database cluster: the index mental model, index-only scans, N+1 queries, connection pooling, and when JSONB helps vs. bites.
FAQ
What is the fastest way to improve Postgres performance? Run
EXPLAIN (ANALYZE, BUFFERS) on your slowest query and look for a Seq Scan with a high
Rows Removed by Filter on a large table. Adding an index on the filtered column is
usually the single biggest win, and it takes one line.
How do I read a Postgres query plan? Read it bottom-up: the innermost (most indented)
nodes run first and feed their parents. Compare estimated vs. actual rows to spot bad
estimates, and watch Buffers to see how many pages were touched. The official
EXPLAIN documentation
walks through each node type.
When should I add an index vs. rewrite the query? Add an index when a single query scans a large table to return few rows. Rewrite the query when you're running many small queries in a loop (N+1) — no index fixes round-trip count, only collapsing them into one set-based query does.
Why does my database slow down with more users even though queries are fast? Almost always connections. Each Postgres connection is a full process; past your core count they contend more than they work. A connection pooler like PgBouncer multiplexes many clients onto a few backends and restores throughput.
Do more indexes always make things faster? No. Indexes speed reads but slow every
write and consume storage, because each one must be updated on INSERT/UPDATE/DELETE.
Index for the queries you actually run and drop unused ones.
Conclusion
Postgres performance is not folklore. The database hands you a precise account of what it
plans to do and what it actually did — your job is to read it, give the planner the
indexes it needs to avoid scanning whole tables, stop your application from making
needless round trips, and keep connection counts sane with a pooler. Work those three
levers in order, measure before and after with EXPLAIN (ANALYZE, BUFFERS), and you will
fix the overwhelming majority of slow-database problems on the hardware you already have.
For the canonical references, see the PostgreSQL docs on
indexes,
using EXPLAIN, and the
number of database connections
wiki page, plus the PgBouncer documentation for
pooling configuration.

