Postgres 9.4 JSONb Array as Table

Postgres 9.4 jsonb array as table

Query

Your table definition is missing. Assuming:

CREATE TABLE configuration (
  config_id serial PRIMARY KEY
, config jsonb NOT NULL
);

To find the a value and its row for given oid and instance:

SELECT c.config_id, d->>'value' AS value
FROM   configuration c
     , jsonb_array_elements(config->'data') d  -- default col name is "value"
WHERE  d->>'oid'      = '1.3.6.1.4.1.7352.3.10.2.5.35.3'
AND    d->>'instance' = '0'
AND    d->>'value'   <> '1';

That's an implicit LATERAL join. Compare:

Query for array elements inside JSON type

What is the fastest way to get a table with 3 columns of oid, instance and value.

I suppose to use jsonb_populate_recordset(), then you can provide data types in the table definition. Assuming text for all:

CREATE TEMP TABLE data_pattern (oid text, value text, instance text);

Could also be a persisted (non-temp) table. This one is only for the current session. Then:

SELECT c.config_id, d.*
FROM   configuration c
     , jsonb_populate_recordset(NULL::data_pattern, c.config->'data') d;

That's all. The first query rewritten:

SELECT c.config_id, d.*
FROM   configuration c
     , jsonb_populate_recordset(NULL::data_pattern, c.config->'data') d
WHERE  d.oid      = '1.3.6.1.4.1.7352.3.10.2.5.35.3'
AND    d.instance = '0'
AND    d.value   <> '1';

But that's slower than the first query. Key to performance with bigger table is index support:

Index

You could easily index the normalized (translated) table or the alternative layout you proposed in the question. Indexing your current layout is not as obvious, but also possible. For best performance I suggest a functional index on just the data key with the jsonb_path_ops operator class. Per documentation:

The technical difference between a jsonb_ops and a jsonb_path_ops GIN
index is that the former creates independent index items for each key
and value in the data, while the latter creates index items only for
each value in the data.

This should work wonders for performance:

CREATE INDEX configuration_my_idx ON configuration
USING gin ((config->'data') jsonb_path_ops);

One might expect that only a complete match for a JSON array element would work, like:

SELECT * FROM configuration
WHERE  (config->'data') @> '[{"oid": "1.3.6.1.4.1.7352.3.10.2.5.35.3"
                            , "instance": "0", "value": "1234"}]';

Note the JSON array notation (with enclosing []) of the provided value: that's required.

But array elements with a subset of keys work as well:

SELECT * FROM configuration
WHERE  (config->'data') @> '[{"oid": "1.3.6.1.4.1.7352.3.10.2.5.35.3"
                            , "instance": "0"}]'

The hard part is to incorporate your added predicate value <> '1'. Care must be taken to apply all predicates to the same array element. You could combine this with the first query:

SELECT c.*, d->>'value' AS value
FROM   configuration c
     , jsonb_array_elements(config->'data') d
WHERE  (config->'data') @> '[{"oid": "1.3.6.1.4.1.7352.3.10.2.5.35.3", "instance": "0"}]'
AND    d->>'oid'      = '1.3.6.1.4.1.7352.3.10.2.5.35.3'  -- must be repeated
AND    d->>'instance' = '0'                               -- must be repeated
AND    d->>'value'   <> '1'                               -- here we can rule out

Voilá.

Special index

If your table is huge, index size may be a deciding factor. You could compare performance of this special solution with a functional index:

This function extracts a Postgres array of oid-instance combinations from a given jsonb value:

CREATE OR REPLACE FUNCTION f_config_json2arr(_j jsonb)
  RETURNS text[]
  LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
$func$
SELECT ARRAY(
   SELECT (elem->>'oid') || '-' || (elem->>'instance')
   FROM   jsonb_array_elements(_j) elem
   )
$func$;

We can build a functional index based on this:

CREATE INDEX configuration_conrfig_special_idx ON configuration
USING  gin (f_config_json2arr(config->'data'));

And base the query on it:

SELECT * FROM configuration
WHERE  f_config_json2arr(config->'data') @> '{1.3.6.1.4.1.7352.3.10.2.5.35.3-0}'::text[]

The idea is that the index should be substantially smaller because it only stores the combined values without keys. The array containment operator @> itself should perform similar to the jsonb containment operator @>. I don't expect a big difference, but I would be very interested which is faster.

Similar to the first solution in this related answer (but more specialized):

Index for finding an element in a JSON array

Asides

I would not use oid as column name since that's also used for internal purposes in Postgres.
If possible, I would use a plain, normalized table without JSON.

PostgreSQL 9.4 expand jsonb int array into table with row numbers

This seems to solve the problem (thanks, Bruno), but it seems like more code than should be necessary?

WITH x AS (SELECT 'name3' as aname, jsonb_array_elements(column1->'name3') AS some_value FROM table1)
SELECT x.*, row_number() OVER () FROM x;

Anyone have a better solution?

How to flatten a jsonb array and return it in a table?

If you are using Postgres 12 or later, you can use a JSON path function:

select qualif_id,
       jsonb_path_query_array(qualif_assessment_formats, '$[*].af_branch')
from the_table;

How to break out jsonb array into rows for a postgresql query

You need to define datatypes:

SELECT
       jobs.id,
       templates.Id,
       templates.Version,
       templates.StepGroupId,
       templates.PublicVersion,
       templates.PlannedDataSheetIds,
       templates.SnapshottedDataSheetValues
FROM jobs,
     jsonb_to_recordset(jobs.source_templates) 
         AS templates(Id UUID, Version INT, StepGroupId UUID, PublicVersion INT,
                      PlannedDataSheetIds INT, SnapshottedDataSheetValues INT)

How to turn a json array into rows in postgres

I post the answer originally written by pozs in the comment section.

unnest() is for PostgreSQL's array types.

Instead one of the following function can be used:

json_array_elements(json) (9.3+)
jsonb_array_elements(jsonb) (9.4+)
json[b]_array_elements_text(json[b]) (9.4+)

Example:

select * from json_array_elements('[1,true, [2,false]]')

output value

 -------------
 | 1         |
 -------------
 | true      |
 -------------
 | [2,false] |
 -------------

Here where the documentation for v9.4 can be found.

Join Postgresql tables on json array columns

Actually, with the data type jsonb in Postgres 9.4 or later, this becomes dead simple. Your query would just work (ugly naming convention, code and duplicate names in the output aside).

CREATE TEMP TABLE table_a(a_id int, a int, b jsonb);
INSERT INTO table_a VALUES
  (1, 36464, '["874746", "474657"]')
, (2, 36465, '["874748"]')
, (3, 36466, '["874736", "474654"]');

CREATE TEMP TABLE table_b(b_id int, name text, b jsonb);
INSERT INTO table_b VALUES
  (1, 'john' , '["8740246", "2474657"]')
, (2, 'mary' , '["874748","874736"]')
, (3, 'clara', '["874736", "474654"]');

Query:

SELECT a_id, a, b.*
FROM   table_a a
JOIN   table_b b USING (b);  -- match on the whole jsonb column

That you even ask indicates you are using the data type json, for which no equality operator exists:

How to query a json column for empty objects?

You just didn't mention the most important details.

The obvious solution is to switch to jsonb.

Answer to your comment

is it possible to flatten out b into new rows rather than an array?

Use jsonb_array_elements(jsonb) or jsonb_array_elements_text(jsonb) in a LATERAL join:

SELECT a_id, a, b.b_id, b.name, b_array_element
FROM   table_a a
JOIN   table_b b USING (b)
     , jsonb_array_elements_text(b) b_array_element

This returns only rows matching on the whole array. About LATERAL:

What is the difference between LATERAL and a subquery in PostgreSQL?

If you want to match on array elements instead, unnest your arrays before you join.

The whole setup seems to be in dire need of normalization.

Postgres 9.4 JSONb Array as Table