Check If Value Exists in Postgres Array

Check if value exists in Postgres array

Simpler with the ANY construct:

SELECT value_variable = ANY ('{1,2,3}'::int[])

The right operand of ANY (between parentheses) can either be a set (result of a subquery, for instance) or an array. There are several ways to use it:

  • SQLAlchemy: how to filter on PgArray column types?
  • IN vs ANY operator in PostgreSQL

Important difference: Array operators (<@, @>, && et al.) expect array types as operands and support GIN or GiST indices in the standard distribution of PostgreSQL, while the ANY construct expects an element type as left operand and does not support these indices. Example:

  • Index for finding an element in a JSON array

None of this works for NULL elements. To test for NULL:

  • Check if NULL exists in Postgres array

Postgres: check if array field contains value?

This should work:

select * from mytable where 'Journal'=ANY(pub_types);

i.e. the syntax is <value> = ANY ( <array> ). Also notice that string literals in postresql are written with single quotes.

Check if NULL exists in Postgres array

Postgres 9.5 or later

Or use array_position(). Basically:

SELECT array_position(arr, NULL) IS NOT NULL AS array_has_null

See demo below.

Postgres 9.3 or later

You can test with the built-in functions array_remove() or array_replace().

Postgres 9.1 or any version

If you know a single element that can never exist in your arrays, you can use this fast expression. Say, you have an array of positive numbers, and -1 can never be in it:

-1 = ANY(arr) IS NULL

Related answer with detailed explanation:

  • Is array all NULLs in PostgreSQL

If you cannot be absolutely sure, you could fall back to one of the expensive but safe methods with unnest(). Like:

(SELECT bool_or(x IS NULL) FROM unnest(arr) x)

or:

EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)

But you can have fast and safe with a CASE expression. Use an unlikely number and fall back to the safe method if it should exist. You may want to treat the case arr IS NULL separately. See demo below.

Demo

SELECT num, arr, expect
, -1 = ANY(arr) IS NULL AS t_1 -- 50 ms
, (SELECT bool_or(x IS NULL) FROM unnest(arr) x) AS t_2 -- 754 ms
, EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL) AS t_3 -- 521 ms
, CASE -1 = ANY(arr)
WHEN FALSE THEN FALSE
WHEN TRUE THEN EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)
ELSE NULLIF(arr IS NOT NULL, FALSE) -- catch arr IS NULL -- 55 ms
-- ELSE TRUE -- simpler for columns defined NOT NULL -- 51 ms
END AS t_91
, array_replace(arr, NULL, 0) <> arr AS t_93a -- 99 ms
, array_remove(arr, NULL) <> arr AS t_93b -- 96 ms
, cardinality(array_remove(arr, NULL)) <> cardinality(arr) AS t_94 -- 81 ms
, COALESCE(array_position(arr, NULL::int), 0) > 0 AS t_95a -- 49 ms
, array_position(arr, NULL) IS NOT NULL AS t_95b -- 45 ms
, CASE WHEN arr IS NOT NULL
THEN array_position(arr, NULL) IS NOT NULL END AS t_95c -- 48 ms
FROM (
VALUES (1, '{1,2,NULL}'::int[], true) -- extended test case
, (2, '{-1,NULL,2}' , true)
, (3, '{NULL}' , true)
, (4, '{1,2,3}' , false)
, (5, '{-1,2,3}' , false)
, (6, NULL , null)
) t(num, arr, expect);

Result:

 num |  arr        | expect | t_1    | t_2  | t_3 | t_91 | t_93a | t_93b | t_94 | t_95a | t_95b | t_95c
-----+-------------+--------+--------+------+-----+------+-------+-------+------+-------+-------+-------
1 | {1,2,NULL} | t | t | t | t | t | t | t | t | t | t | t
2 | {-1,NULL,2} | t | f --!! | t | t | t | t | t | t | t | t | t
3 | {NULL} | t | t | t | t | t | t | t | t | t | t | t
4 | {1,2,3} | f | f | f | f | f | f | f | f | f | f | f
5 | {-1,2,3} | f | f | f | f | f | f | f | f | f | f | f
6 | NULL | NULL | t --!! | NULL | f | NULL | NULL | NULL | NULL | f | f | NULL

Note that array_remove() and array_position() are not allowed for multi-dimensional arrays. All expressions to the right of t_93a only work for 1-dimenstioal arrays.

db<>fiddle here - Postgres 13, with more tests

Old sqlfiddle

Benchmark setup

The added times are from a benchmark test with 200k rows in Postgres 9.5. This is my setup:

CREATE TABLE t AS
SELECT row_number() OVER() AS num
, array_agg(elem) AS arr
, bool_or(elem IS NULL) AS expected
FROM (
SELECT CASE WHEN random() > .95 THEN NULL ELSE g END AS elem -- 5% NULL VALUES
, count(*) FILTER (WHERE random() > .8)
OVER (ORDER BY g) AS grp -- avg 5 element per array
FROM generate_series (1, 1000000) g -- increase for big test case
) sub
GROUP BY grp;

Function wrapper

For repeated use, I would create a function in Postgres 9.5 like this:

CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
RETURNS bool
LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT array_position($1, NULL) IS NOT NULL';

PARALLEL SAFE only for Postgres 9.6 or later.

Using a polymorphic input type this works for any array type, not just int[].

Make it IMMUTABLE to allow performance optimization and index expressions.

  • Does PostgreSQL support "accent insensitive" collations?

But don't make it STRICT, which would disable "function inlining" and impair performance because array_position() is not STRICT itself. See:

  • Function executes faster without STRICT modifier?

If you need to catch the case arr IS NULL:

CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
RETURNS bool
LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT CASE WHEN $1 IS NOT NULL
THEN array_position($1, NULL) IS NOT NULL END';

For Postgres 9.1 use the t_91 expression from above. The rest applies unchanged.

Closely related:

  • How to determine if NULL is contained in an array in Postgres?

Check if a String exists in Postgres JSONB array

that's what the ? operator will do:

select *
from the_table
where numbers ? '16516516'

Online example

How to check if all array elements exists in set of data in table

If you want one row per door_id whose set of available_id is exactly the same as the given array, you can use aggregation like so:

select door_id
from mytable
group by door_id
having array_agg(available_id order by available_id) = array[1,2,3,4,6,52]

If you are fine with subset containment:

having array_agg(available_id order by available_id) @> array[1,2,3,4,6,52]

If you want all corresponding rows, use window aggregation:

select *
from (
select t.*,
array_agg(available_id order by available_id) over(partition by door_id) arr
from mytable t
) t
where arr @> array[1,2,3,4,6,52]

Postgres index type for array column for ANY queries

Here is a detailed explanation why the ANY construct with the indexed column to the right cannot tap into a GIN index (or any index, for that matter):

  • Can PostgreSQL index array columns?

But array operators can. See:

  • Check if value exists in Postgres array

To force a test with a small table, you can disable (massively discourage, really) sequential scans in your current session with:

SET enable_seqscan = OFF;

See:

  • Postgres query optimization (forcing an index scan)


Related Topics



Leave a reply



Submit