DevgainsDevgainsDevgains
All articles

PostgreSQL Performance: A Practical Guide to Indexes, Query Plans, and Pooling

·11 min read·Updated Jun 30, 2026
PostgreSQL Performance: A Practical Guide to Indexes, Query Plans, and Pooling

Cover: gradient generated for Devgains

Postgres performance work almost always comes down to three levers, in this order: read what the planner is actually doing with EXPLAIN, make sure the right indexes exist so it can avoid scanning whole tables, and stop your application from opening more connections than the database can serve. Get those three right and most "the database is slow" incidents disappear — usually without a bigger instance. This guide is the hub for the Devgains database cluster: it gives you the mental model, then points to the deep dives for each piece.

The reason performance tuning feels like guesswork is that people change things they cannot measure. They add an index and hope, bump work_mem because a blog said so, or scale the instance up because CPU looks high. PostgreSQL is far more knowable than that. It will tell you exactly how it plans to run a query, why a query is slow, and what it expects each step to cost — if you ask it.

Quick answer: how do you improve Postgres performance?

To improve Postgres performance, work the slow query, not the server:

  1. Measure first. Run EXPLAIN (ANALYZE, BUFFERS) on the slow query and read the plan bottom-up. Find the node that eats the time.
  2. Index the predicate. If you see a Seq Scan filtering most rows away, add an index on the column(s) in the WHERE/JOIN/ORDER BY.
  3. Cut round trips. Collapse N+1 query loops into a single set-based query.
  4. Pool connections. Put PgBouncer in front of the database so hundreds of clients share a small number of backends.
  5. Re-measure. Confirm the plan changed and the latency dropped. Never trust the fix until the plan proves it.

Everything below is an expansion of those five steps.

Why it matters

A slow database is rarely a slow database — it is usually an application asking it to do unnecessary work. Postgres on modest hardware comfortably serves tens of thousands of well-planned queries per second. The same instance falls over at a few hundred badly-planned ones, because a single missing index turns a millisecond index lookup into a multi-hundred-millisecond sequential scan, and a connection storm turns spare CPU into lock contention and context-switching.

That gap — between what the hardware can do and what a careless query pattern lets it do — is where almost all the cost and almost all the outages live. Tuning it is some of the highest-leverage work in a backend, and it is cheap: a one-line index or a pooler config often beats a month of vertical scaling.

The mental model: how Postgres runs a query

Every query goes through the planner, which estimates the cheapest way to produce the rows you asked for using the statistics it keeps about each table. It chooses between access methods — and naming them is half the battle:

Plan nodeWhat it doesWhen it's goodWhen it's a smell
Seq ScanReads the whole tableSmall tables, or you need most rowsLarge table + selective WHERE → missing index
Index ScanWalks an index, then fetches matching rows from the heapSelective predicate returning few rowsReturning most of the table (heap fetches dominate)
Index Only ScanAnswers entirely from the index, skips the heapAll needed columns are in the index— (this is the goal; see below)
Bitmap Heap ScanBuilds a bitmap of matching pages, then reads them in orderMedium selectivity, multiple indexesOften fine; tune if it reads too many pages
Nested Loop / Hash / Merge JoinThree join strategiesPlanner picks by row estimatesWrong choice usually means stale statistics

The single most important habit is reading the plan. EXPLAIN shows the planner's estimate; EXPLAIN ANALYZE actually runs the query and shows real timings and real row counts. The gap between estimated and actual rows is your best clue:

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, email
FROM users
WHERE created_at >= now() - interval '7 days';
Seq Scan on users  (cost=0.00..18334.00 rows=812 width=36)
                   (actual time=0.014..142.881 rows=796 loops=1)
  Filter: (created_at >= (now() - '7 days'::interval))
  Rows Removed by Filter: 999204
  Buffers: shared hit=8334
Planning Time: 0.121 ms
Execution Time: 142.948 ms

Read it bottom-up. The Seq Scan walked the whole table — Buffers: shared hit=8334 pages — and threw away 999,204 rows to keep 796. That Rows Removed by Filter line is the tell: the database did a million rows of work to return eight hundred. This query wants an index.

Step by step: from Seq Scan to Index Scan

Add an index on the column in the predicate:

CREATE INDEX CONCURRENTLY idx_users_created_at ON users (created_at);

CONCURRENTLY builds the index without taking a write lock on the table, so it is safe to run on a live system (it is slower and can't run inside a transaction block). Re-run the same EXPLAIN (ANALYZE, BUFFERS) and the plan flips:

Index Scan using idx_users_created_at on users
   (cost=0.42..32.86 rows=812 width=36)
   (actual time=0.028..0.391 rows=796 rows=796 loops=1)
  Index Cond: (created_at >= (now() - '7 days'::interval))
  Buffers: shared hit=12
Planning Time: 0.144 ms
Execution Time: 0.447 ms

From 8,334 buffers to 12, and from 143 ms to under half a millisecond. That is the entire game in miniature: the index let Postgres jump straight to the matching rows instead of reading the whole table. The mental model behind why this works — and why every index you add taxes your writes — is worth internalizing before you add indexes reflexively.

If you also only ever select columns that live in the index, Postgres can skip the table heap entirely and return an index-only scan, the fastest read it offers. A covering index makes that explicit:

-- Now SELECT id, email ... can be answered from the index alone.
CREATE INDEX idx_users_created_at_covering
  ON users (created_at) INCLUDE (id, email);

Index the columns you filter and join on, in the order you use them. A composite index on (tenant_id, created_at) serves WHERE tenant_id = ? AND created_at > ? and WHERE tenant_id = ?, but not WHERE created_at > ? alone — the leftmost column has to be in the predicate. This left-to-right rule explains most "why is my index ignored?" surprises.

The query pattern that no index can save: N+1

Indexes fix individual slow queries. They do nothing for the most common application-level performance bug, where the query is fast but you run it hundreds of times. An ORM that lazy-loads a relation inside a loop fires one query per row — the N+1 query problem:

SELECT * FROM posts WHERE author_id = 1;   -- the "1"
SELECT * FROM comments WHERE post_id = 10;  -- + N
SELECT * FROM comments WHERE post_id = 11;
SELECT * FROM comments WHERE post_id = 12;
... 200 more

Each query is sub-millisecond and perfectly indexed, so nothing looks wrong — yet the page takes a second because of round-trip latency multiplied by N. The fix is to ask for everything in one set-based query (a JOIN or WHERE post_id = ANY($1)), not to tune any single statement. No index makes 200 round trips fast; making it one round trip does.

The third lever: connections don't scale linearly

Suppose every query is indexed and N+1-free, and the database still falls over under load. The culprit is almost always connections. A PostgreSQL connection is backed by a full OS process with its own memory; a few dozen are cheap, but hundreds of concurrent connections spend more time contending for locks and CPU than doing work. Throughput goes down as connections climb past the core count.

The fix is a pooler — typically PgBouncer — that lets thousands of application clients multiplex onto a small, fixed set of real backends:

# pgbouncer.ini — transaction pooling: a backend is handed back to the
# pool at the end of each transaction, so a handful serve many clients.
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb
 
[pgbouncer]
pool_mode = transaction
max_client_conn = 5000     # clients that may connect to PgBouncer
default_pool_size = 20     # real Postgres backends per database

Five thousand clients, twenty backends. A good starting point for default_pool_size is roughly ((core_count * 2) + effective_spindle_count) — small, deliberately. The instinct to raise it under load is exactly backwards; a smaller pool of busy connections beats a large pool of contending ones.

Best practices

  • Always EXPLAIN (ANALYZE, BUFFERS) before and after. A fix you cannot see in the plan is a guess. BUFFERS exposes how many pages were actually touched.
  • Keep statistics fresh. Let autovacuum run, and ANALYZE after big data changes. Most "the planner picked a dumb plan" problems are stale row estimates.
  • Index for the workload, not the schema. Add indexes for the queries you actually run; drop ones that aren't used (pg_stat_user_indexes shows usage). Every index slows writes and costs storage.
  • Prefer composite and covering indexes over many single-column ones when queries filter on several columns together.
  • Pool connections from day one. It is far easier to add PgBouncer early than to diagnose a connection storm at 2 a.m.
  • Watch pg_stat_statements to find the queries that cost the most aggregate time — optimize the top of that list, not whatever felt slow once.

Common mistakes

  • Tuning the server before the query. Bumping shared_buffers won't save a Seq Scan that an index would eliminate. Fix the plan first.
  • Adding indexes blindly. Each one taxes every INSERT/UPDATE/DELETE. An unused index is pure overhead.
  • Wrapping indexed columns in functions. WHERE lower(email) = $1 can't use a plain index on email — you need an expression index on lower(email).
  • Raising the connection limit under load. More connections past the core count make contention worse, not better. Pool instead.
  • Trusting EXPLAIN without ANALYZE. Estimates lie when statistics are stale; only ANALYZE shows what really happened.
  • SELECT * everywhere, which defeats index-only scans and ships columns you don't use.

Takeaways

  • Three levers, in order: read the plan with EXPLAIN (ANALYZE, BUFFERS), add the right index, pool connections. Most slowness lives in one of the three.
  • Rows Removed by Filter on a big table = missing index. That line is your fastest diagnostic.
  • Indexes fix one slow query; set-based queries fix N+1. They solve different problems — don't reach for an index when the bug is too many round trips.
  • Connections don't scale linearly. A small, busy pool beats a large, contending one.
  • Measure before and after, every time. A fix that doesn't change the plan didn't fix anything.

Go deeper in the Database cluster: the index mental model, index-only scans, N+1 queries, connection pooling, and when JSONB helps vs. bites.

FAQ

What is the fastest way to improve Postgres performance? Run EXPLAIN (ANALYZE, BUFFERS) on your slowest query and look for a Seq Scan with a high Rows Removed by Filter on a large table. Adding an index on the filtered column is usually the single biggest win, and it takes one line.

How do I read a Postgres query plan? Read it bottom-up: the innermost (most indented) nodes run first and feed their parents. Compare estimated vs. actual rows to spot bad estimates, and watch Buffers to see how many pages were touched. The official EXPLAIN documentation walks through each node type.

When should I add an index vs. rewrite the query? Add an index when a single query scans a large table to return few rows. Rewrite the query when you're running many small queries in a loop (N+1) — no index fixes round-trip count, only collapsing them into one set-based query does.

Why does my database slow down with more users even though queries are fast? Almost always connections. Each Postgres connection is a full process; past your core count they contend more than they work. A connection pooler like PgBouncer multiplexes many clients onto a few backends and restores throughput.

Do more indexes always make things faster? No. Indexes speed reads but slow every write and consume storage, because each one must be updated on INSERT/UPDATE/DELETE. Index for the queries you actually run and drop unused ones.

Conclusion

Postgres performance is not folklore. The database hands you a precise account of what it plans to do and what it actually did — your job is to read it, give the planner the indexes it needs to avoid scanning whole tables, stop your application from making needless round trips, and keep connection counts sane with a pooler. Work those three levers in order, measure before and after with EXPLAIN (ANALYZE, BUFFERS), and you will fix the overwhelming majority of slow-database problems on the hardware you already have.

For the canonical references, see the PostgreSQL docs on indexes, using EXPLAIN, and the number of database connections wiki page, plus the PgBouncer documentation for pooling configuration.

11 min read

Read next