Index for finding an element in a JSON array
jsonb
in Postgres 9.4+
The binary JSON data type jsonb
largely improves index options. You can now have a GIN index on a jsonb
array directly:
CREATE TABLE tracks (id serial, artists jsonb); -- !
CREATE INDEX tracks_artists_gin_idx ON tracks USING gin (artists);
No need for a function to convert the array. This would support a query:
SELECT * FROM tracks WHERE artists @> '[{"name": "The Dirty Heads"}]';
@>
being the jsonb
"contains" operator, which can use the GIN index. (Not for json
, only jsonb
!)
Or you use the more specialized, non-default GIN operator class jsonb_path_ops
for the index:
CREATE INDEX tracks_artists_gin_idx ON tracks
USING gin (artists jsonb_path_ops); -- !
Same query.
Currently jsonb_path_ops
only supports the @>
operator. But it's typically much smaller and faster. There are more index options, details in the manual.
If the column artists
only holds names as displayed in the example, it would be more efficient to store just the values as JSON text primitives and the redundant key can be the column name.
Note the difference between JSON objects and primitive types:
- Using indexes in json array in PostgreSQL
CREATE TABLE tracks (id serial, artistnames jsonb);
INSERT INTO tracks VALUES (2, '["The Dirty Heads", "Louis Richards"]');
CREATE INDEX tracks_artistnames_gin_idx ON tracks USING gin (artistnames);
Query:
SELECT * FROM tracks WHERE artistnames ? 'The Dirty Heads';
?
does not work for object values, just keys and array elements.
Or:
CREATE INDEX tracks_artistnames_gin_idx ON tracks
USING gin (artistnames jsonb_path_ops);
Query:
SELECT * FROM tracks WHERE artistnames @> '"The Dirty Heads"'::jsonb;
More efficient if names are highly duplicative.
json
in Postgres 9.3+
This should work with an IMMUTABLE
function:
CREATE OR REPLACE FUNCTION json2arr(_j json, _key text)
RETURNS text[] LANGUAGE sql IMMUTABLE AS
'SELECT ARRAY(SELECT elem->>_key FROM json_array_elements(_j) elem)';
Create this functional index:
CREATE INDEX tracks_artists_gin_idx ON tracks
USING gin (json2arr(artists, 'name'));
And use a query like this. The expression in the WHERE
clause has to match the one in the index:
SELECT * FROM tracks
WHERE '{"The Dirty Heads"}'::text[] <@ (json2arr(artists, 'name'));
Updated with feedback in comments. We need to use array operators to support the GIN index.
The "is contained by" operator <@
in this case.
Notes on function volatility
You can declare your function IMMUTABLE
even if json_array_elements()
isn't wasn't.
Most JSON
functions used to be only STABLE
, not IMMUTABLE
. There was a discussion on the hackers list to change that. Most are IMMUTABLE
now. Check with:
SELECT p.proname, p.provolatile
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname = 'pg_catalog'
AND p.proname ~~* '%json%';
Functional indexes only work with IMMUTABLE
functions.
Add and use index for jsonb with nested arrays
You already have a very good index to support your query.
Make use of it with the jsonb
"contains" operator" @>
:
SELECT *
FROM my_table
WHERE marc->'dynamicFields' @> '[{"name": "200", "subfields":[{"name": "a"}]}]';
db<>fiddle here
Carefully match the structure of the JSON object in the table. Then rows are selected cheaply using the index.
You can then extract whatever parts you need from qualifying rows.
Detailed instructions:
- Index for finding an element in a JSON array
If one of the filters is very selective on its own, it might be faster to split the two conditions like in your original. Either way, both variants should be fast:
SELECT *
FROM my_table
WHERE marc->'dynamicFields' @> '[{"name": "200"}]'
AND marc->'dynamicFields' @> '[{"subfields":[{"name": "a"}]}]';
Using indexes in json array in PostgreSQL
This specific jsonb example from the original answer was missing the array layer []
around the non-primitive object for the containment query. It has since been fixed.
The behavior documented for PostgreSQL 9.4.x jsonb Containment and Existence states:
The general principle is that the contained object must match the containing object as to structure and data contents
...
As a special exception to the general principle that the structures must match, an array may contain a primitive value
The special exception allows us to do the following:
CREATE TABLE tracks (id serial, artistnames jsonb);
CREATE INDEX tracks_artistnames_gin_idx ON tracks USING gin (artistnames);
INSERT INTO tracks (id, artists) VALUES (1, '["blink-182"]');
INSERT INTO tracks (id, artists) VALUES (2, '["The Dirty Heads", "Louis Richards"]');
We can query for containment using the general principle:
SELECT * FROM tracks WHERE artistnames @> '["The Dirty Heads"]';
id | artistnames
----+---------------------------------------
2 | ["The Dirty Heads", "Louis Richards"]
(1 row)
We can also query for containment using the special exception since the array contains primitive types:
SELECT * FROM tracks WHERE artistnames @> '"The Dirty Heads"';
id | artistnames
----+---------------------------------------
2 | ["The Dirty Heads", "Louis Richards"]
(1 row)
There are 4 primitive types that allow containment and existence queries on arrays to work:
- string
- number
- boolean
- null
Since the example you mentioned in your question is dealing with objects nested inside an array, we don't qualify for the special exception mentioned above:
CREATE TABLE tracks (id serial, artists jsonb);
CREATE INDEX tracks_artists_gin_idx ON tracks USING gin (artists);
INSERT INTO tracks (id, artists) VALUES (1, '[{"name": "blink-182"}]');
INSERT INTO tracks (id, artists) VALUES (2, '[{"name": "The Dirty Heads"}, {"name": "Louis Richards"}]');
We can query for containment using the general principle:
SELECT * FROM tracks WHERE artists @> '[{"name": "The Dirty Heads"}]';
id | artists
----+-----------------------------------------------------------
2 | [{"name": "The Dirty Heads"}, {"name": "Louis Richards"}]
(1 row)
Objects are not considered a primitive type, so the following query for containment does not qualify for the special exception and therefore does not work:
SELECT * FROM tracks WHERE artists @> '{"name": "The Dirty Heads"}';
id | artists
----+---------
(0 rows)
Indexing a jsonb array in Postgres
There is a general rule that you have to use exactly the same expression both in an index and a query to use the index. With this index:
CREATE INDEX idx_aip_tags
ON t_aip
USING gin ((json_aip#>'{properties,pdi,contextInformation,tags}'));
the query will use the index
EXPLAIN ANALYZE
SELECT count(*)
FROM t_aip
WHERE json_aip#>'{properties,pdi,contextInformation,tags}' ?| array['SOME_TAG']
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=149.97..149.98 rows=1 width=0) (actual time=27.783..27.783 rows=1 loops=1)
-> Bitmap Heap Scan on t_aip (cost=20.31..149.87 rows=40 width=0) (actual time=1.504..25.726 rows=20000 loops=1)
Recheck Cond: ((json_aip #> '{properties,pdi,contextInformation,tags}'::text[]) ?| '{SOME_TAG}'::text[])
Heap Blocks: exact=345
-> Bitmap Index Scan on idx_aip_tags (cost=0.00..20.30 rows=40 width=0) (actual time=1.455..1.455 rows=20000 loops=1)
Index Cond: ((json_aip #> '{properties,pdi,contextInformation,tags}'::text[]) ?| '{SOME_TAG}'::text[])
Note that the GIN index supports also @>
operator:
SELECT count(*)
FROM t_aip
WHERE json_aip#>'{properties,pdi,contextInformation,tags}' @> '["SOME_TAG"]'
but be careful when searching for more than one tag:
SELECT count(*)
FROM t_aip
-- this gives objects containing both tags:
-- WHERE json_aip#>'{properties,pdi,contextInformation,tags}' @> '["SOME_TAG", "ANOTHER_TAG"]'
-- and this gives objects with any of two tags:
WHERE json_aip#>'{properties,pdi,contextInformation,tags}' @> ANY(ARRAY['["SOME_TAG"]', '["ANOTHER_TAG"]']::jsonb[])
Index JSON Array in Postgres DB
Please check JSONB
Indexing section in Postgres documentation.
For your case index config may be the following:
CREATE INDEX idx_gin_wishlist ON your_table USING gin ((jsonb_column -> 'wish_list'));
It will store copies of every key and value inside wish_list
, but you should be careful with a query which hits the index. You should use @>
operator:
SELECT jsonb_column->'wish_list'
FROM your_table WHERE jsonb_column->'wish_list' @> '[{"present_link": "www.amazon.com", "present_name": "Counting Crows"}]';
Strongly suggested to check existing nswers:
- How to query for array elements inside JSON type
- Index for finding an element in a JSON array
Related Topics
Redshift: Executing a Dynamic Query from a String
Re-Use Aliased Field in SQL Select Statement
How to Re-Use Result for Select, Where and Order by Clauses
Summarize the List into a Comma-Separated String
Ora-00904 Invalid Identifier on Decode Alias
SQL Server: Order by Parameters in In Statement
How to Find 11Th Entry in SQL Access Database Table
Cascade on Delete or Use Triggers
SQL Server 2008 - Conditional Query
SQL - Safely Downcast Bigint to Int
Unique Constraint Controlled by a Bit Column
Transpose Select Results with Oracle
Getting an Error When Executing a Dynamic SQL Within a Function (SQL Server)
Find Only Capital Letters in Word Through in SQL Server Query