Postgres JSONB: When to Use It and When It Bites
Photo: Unsplash
PostgreSQL's JSONB type is one of those features that feels like a superpower the first week and a liability the first time you have to migrate it. It lets you drop semi-structured data into a relational database without designing a schema, query into it with real operators, and even index it. Used well, it is genuinely excellent. Used as a substitute for thinking about your data model, it quietly becomes the thing that bites you eighteen months later.
The skill is not "JSONB good" or "JSONB bad." It is knowing which of your columns deserve the flexibility and which are columns pretending to be JSON. Let us draw that line clearly.
JSON vs JSONB: pick JSONB
First, a quick disambiguation, because Postgres has two JSON types. json stores an exact text copy of the input, preserving whitespace and key order, and re-parses it on every access. jsonb stores a decomposed binary form: parsing happens once on write, and reads are fast. jsonb also supports indexing and a rich operator set.
For virtually all application use, you want jsonb. The only real reason to reach for json is if you must preserve the exact textual representation, which is rare. The PostgreSQL JSON types documentation lays out the distinction in full.
CREATE TABLE products (
id bigserial PRIMARY KEY,
sku text NOT NULL,
name text NOT NULL,
price_cents int NOT NULL,
attributes jsonb NOT NULL DEFAULT '{}'
);
INSERT INTO products (sku, name, price_cents, attributes) VALUES
('TS-001', 'Wool Coat', 24900,
'{"color": "charcoal", "sizes": ["S","M","L"], "waterproof": true}');Where JSONB genuinely shines
JSONB earns its place when the data is legitimately variable in shape, or when imposing columns would be premature or impossible:
- Heterogeneous attributes. A
productstable where each category has different fields: coats havewaterproof, mugs havevolume_ml. Modeling every possible attribute as a column gives you a sparse mess; oneattributes jsonbcolumn handles them all. - External payloads you store as-is. Webhook bodies, third-party API responses, audit snapshots. The shape is owned by someone else and may change without warning.
- User-defined or plugin data. Custom form fields, feature flags, per-tenant settings where you cannot predict the keys.
In all three, the defining trait is the same: the structure is genuinely unknown, variable, or not yours to control. That is the right home for JSONB.
-- Real querying into the document, not just blob storage
SELECT sku, name
FROM products
WHERE attributes ->> 'color' = 'charcoal' -- ->> extracts text
AND attributes @> '{"waterproof": true}'; -- @> containsThe ->>, ->, @>, and ? operators make JSONB a queryable structure, not an opaque blob. That is exactly what separates it from stuffing a JSON string into a text column.
Indexing JSONB: the GIN index
The objection "but querying inside JSON must be slow" is answered by the GIN index. A GIN (Generalized Inverted Index) on a jsonb column lets containment and key-existence queries use an index instead of scanning every row and parsing every document.
-- Index the whole document for containment (@>) and key (?) queries
CREATE INDEX idx_products_attributes
ON products USING gin (attributes);
EXPLAIN ANALYZE
SELECT sku FROM products
WHERE attributes @> '{"waterproof": true}'; QUERY PLAN
------------------------------------------------------------------
Bitmap Heap Scan on products (cost=12.00..156.21 rows=48 width=12)
Recheck Cond: (attributes @> '{"waterproof": true}'::jsonb)
-> Bitmap Index Scan on idx_products_attributes
(cost=0.00..11.99 rows=48 width=0)
Index Cond: (attributes @> '{"waterproof": true}'::jsonb)
Planning Time: 0.180 ms
Execution Time: 0.402 msThat Bitmap Index Scan is the GIN index doing its job. There is also a more compact jsonb_path_ops GIN variant that is smaller and faster for containment queries at the cost of supporting fewer operators. The JSON indexing documentation covers both.
GIN indexes make JSONB reads fast, but they are larger and slower to update than B-tree indexes. On a write-heavy JSONB column, the index maintenance cost is real. Index the column because you query it often, not reflexively.
Where JSONB bites
Now the failure modes, because this is where teams get hurt.
1. Using JSONB for data that is obviously relational. If every row has the same keys and you constantly filter, join, and aggregate on them, those are columns wearing a JSON costume. You lose NOT NULL, foreign keys, type checking, and clean B-tree indexes, and you gain nothing. A price you sort by belongs in a typed column, not in attributes ->> 'price'.
2. No schema means no guarantees. The database will happily store {"color": "charcoal"} in one row and {"colour": 7} in the next. Typos, type drift, and missing keys are caught nowhere. With real columns, the database enforces shape; with JSONB, your application code becomes the only schema, and it is never as thorough as you think.
3. Awkward, fragile querying for relational access. Pulling a value out of JSON requires casting, and casts are easy to get wrong:
-- Sorting by a numeric value stored in JSON: note the explicit cast.
-- Forget it, and you sort "100" before "20" as text. Subtle and nasty.
SELECT sku FROM products
ORDER BY (attributes ->> 'price_cents')::int DESC;4. Painful migrations. Renaming a key or changing a value's type across millions of JSONB documents means a bulk UPDATE rewriting every row, with no column-level tooling to help. A real column rename is metadata; a JSON key rename is a data migration.
5. Update write amplification. Updating one key in a large JSONB document rewrites the entire document, because Postgres stores the value as a whole. Frequent small updates to big documents generate far more write and bloat than updating a narrow column would.
A practical decision rule
Ask one question of each piece of data: is this shape known and stable, and do I query or constrain it relationally?
- Yes to both: make it a real, typed column. You get integrity, clean indexing, and cheap migrations.
- No (the shape is variable, external, or user-defined): JSONB is the right tool, and a GIN index makes it fast.
The hybrid is usually best: promote the handful of fields you always filter and sort on into typed columns, and keep the long, variable tail in a jsonb column. That gives you relational integrity where it matters and document flexibility where it helps, instead of forcing everything into one model.
Takeaways
- Prefer
jsonboverjsonfor nearly all uses: it parses once, reads fast, and supports indexing and rich operators. - JSONB excels for genuinely variable, external, or user-defined data, where a fixed schema is premature or impossible.
- A GIN index makes containment (
@>) and key (?) queries fast, but costs more on writes than a B-tree, so index because you query, not by reflex. - It bites when used for stable relational data: you lose constraints, type safety, clean indexing, and easy migrations, and updates rewrite whole documents.
- The strong default is a hybrid model: typed columns for the fields you filter and sort on, a
jsonbcolumn for the variable tail.

