Intersection of multiple arrays in PostgreSQL
The closest thing to an array intersection that I can think of is this:
select array_agg(e)
from (
select unnest(a1)
intersect
select unnest(a2)
) as dt(e)
This assumes that a1 and a2 are single-dimension arrays with the same type of elements. You could wrap that up in a function something like this:
create function array_intersect(a1 int[], a2 int[]) returns int[] as $$
declare
ret int[];
begin
-- The reason for the kludgy NULL handling comes later.
if a1 is null then
return a2;
elseif a2 is null then
return a1;
end if;
select array_agg(e) into ret
from (
select unnest(a1)
intersect
select unnest(a2)
) as dt(e);
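-- array_agg over zero rows yields NULL; return an empty array instead so that
-- an empty intersection is not later mistaken for the NULL "universal set".
return coalesce(ret, '{}'::int[]);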
end;
$$ language plpgsql;
Then you could do things like this:
=> select array_intersect(ARRAY[2,4,6,8,10], ARRAY[1,2,3,4,5,6,7,8,9,10]);
array_intersect
-----------------
{6,2,4,10,8}
(1 row)
Note that this doesn't guarantee any particular order in the returned array but you can fix that if you care about it. Then you could create your own aggregate function:
-- Pre-9.1
create aggregate array_intersect_agg(
sfunc = array_intersect,
basetype = int[],
stype = int[],
initcond = NULL
);
-- 9.1+ (AFAIK; I don't have 9.1 handy at the moment,
-- see the comments below)
create aggregate array_intersect_agg(int[]) (
sfunc = array_intersect,
stype = int[]
);
And now we see why array_intersect does funny and somewhat kludgy things with NULLs. We need an initial value for the aggregation that behaves like the universal set, and we can use NULL for that (yes, this smells a bit off, but I can't think of anything better off the top of my head).
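To make the "NULL as universal set" idea concrete, here is a small check against the array_intersect function defined above. It is only a sketch: the explicit NULL::int[] casts are added for clarity and are not part of the original answer.

```sql
-- A NULL argument is treated as "everything", so the other array comes back unchanged.
select array_intersect(NULL::int[], ARRAY[1,2,3]);   -- {1,2,3}
select array_intersect(ARRAY[1,2,3], NULL::int[]);   -- {1,2,3}
```

This is what lets the aggregate start from a NULL state: the first row it sees simply becomes the running intersection.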
Once all this is in place, you can do things like this:
> select * from stuff;
a
---------
{1,2,3}
{1,2,3}
{3,4,5}
(3 rows)
> select array_intersect_agg(a) from stuff;
array_intersect_agg
---------------------
{3}
(1 row)
Not exactly simple or efficient but maybe a reasonable starting point and better than nothing at all.
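If the element order matters, one possible variant sorts inside array_agg. This is a sketch under a few assumptions: the name array_intersect_sorted is hypothetical, referring to parameters by name in a SQL-language function needs PostgreSQL 9.2 or later, and the NULL handling from above is deliberately left out, so this version is not meant to be used as the aggregate's sfunc.

```sql
-- Hypothetical helper: same intersection logic as above, but with sorted output.
create function array_intersect_sorted(a1 int[], a2 int[]) returns int[] as $$
    select array_agg(e order by e)
    from (
        select unnest(a1)
        intersect
        select unnest(a2)
    ) as dt(e);
$$ language sql;

select array_intersect_sorted(ARRAY[2,4,6,8,10], ARRAY[1,2,3,4,5,6,7,8,9,10]);
-- {2,4,6,8,10}
```

Note that array_agg over zero rows yields NULL, so this sketch returns NULL rather than an empty array when the inputs have nothing in common.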
Useful references:
- array_agg
- create aggregate
- create function
- PL/pgSQL
- unnest
Postgres WHERE two arrays have a non-empty intersection
You can use &&, the array overlap operator:
select *
from foo
where tags && ARRAY['apples', 'bananas', 'cherries']
From the documentation:
&&: overlap (have elements in common)
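A quick way to see how the operator behaves is to try it on literals. The values below are made up for illustration; both sides must be arrays of the same element type.

```sql
-- One shared element ('apples') is enough for && to return true.
select ARRAY['apples', 'kiwis'] && ARRAY['apples', 'bananas', 'cherries'] as overlaps;  -- t

-- No shared elements: false.
select ARRAY['kiwis'] && ARRAY['apples', 'bananas', 'cherries'] as overlaps;            -- f
```

The operator only reports whether the arrays overlap; it does not return the common elements themselves.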
Finding the intersection between two integer arrays in postgres
The documentation you linked is for an extension. To use it, you have to run CREATE EXTENSION intarray on your database so that those commands work. This loads the extension into the database, and from then on you can use it in all queries on that database.
You can read more about extensions, and how to load them, in the PostgreSQL documentation for the intarray module and for CREATE EXTENSION.
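As a minimal sketch of what that looks like in practice, assuming you have the privileges to install extensions and that the intarray module is available on the server:

```sql
-- Load the extension once per database.
create extension if not exists intarray;

-- intarray adds an & operator that returns the intersection of two integer arrays.
select ARRAY[1,2,3,4] & ARRAY[3,4,5];   -- {3,4}
```

The & operator is specific to integer arrays; for other element types you still need something like the unnest/intersect approach.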
Postgres Group by intersection array
Having
CREATE TABLE temp1
(
id int PRIMARY KEY,
items char[] NOT NULL
);
INSERT INTO temp1 VALUES
( '1', ARRAY['A', 'B'] ),
( '2', ARRAY['A', 'B', 'C'] ),
( '3', ARRAY['E', 'F'] ),
( '4', ARRAY['G'] );
--Indexing array field to speedup queries
CREATE INDEX idx_items on temp1 USING GIN ("items");
Then
select t1.*,
coalesce( (select t2.items from temp1 t2
where t2.items && t1.items
and t1.id != t2.id
and array_length(t2.items,1)<array_length(t1.items,1)
order by array_length(t2.items,1) limit 1 )/*minimum common*/
, t1.items /*trivial solution*/ ) group_alias
from temp1 t1;
https://www.db-fiddle.com/f/46ydeE5ZXCJDk4Rw3cu4jt/10
Intersect on two array_agg columns in the same row
PostgreSQL has LATERAL, which lets a subquery refer to the column values of the row it is joined to, so you can work with the contents of fields at the record level.
create table mytable (day varchar(30), person varchar(1));
INSERT INTO mytable (day, person)
values
('Monday', 'A'),
('Monday', 'B'),
('Tuesday', 'A'),
('Thursday', 'B');
SELECT *
FROM (
select day as d1,
array_agg(distinct person) as agg1
from mytable
group by day) AS AA
cross join
(select day as d2,
array_agg(distinct person) as agg2
from mytable
group by day
) AS BB
CROSS JOIN LATERAL
(
SELECT COUNT(*) AS MatchingPersons
FROM
(
SELECT unnest(agg1) person
INTERSECT
SELECT unnest(agg2)
) q
) lat
d1 | agg1 | d2 | agg2 | matchingpersons
:------- | :---- | :------- | :---- | --------------:
Monday | {A,B} | Monday | {A,B} | 2
Thursday | {B} | Monday | {A,B} | 1
Tuesday | {A} | Monday | {A,B} | 1
Monday | {A,B} | Thursday | {B} | 1
Thursday | {B} | Thursday | {B} | 1
Tuesday | {A} | Thursday | {B} | 0
Monday | {A,B} | Tuesday | {A} | 1
Thursday | {B} | Tuesday | {A} | 0
Tuesday | {A} | Tuesday | {A} | 1
Intersection of multiple text arrays: ERROR: array value must start with {
From the documentation:
initial_condition
The initial setting for the state value. This must be a string
constant in the form accepted for the data type state_data_type. If
not specified, the state value starts out null.
So the aggregate declaration should look like this:
CREATE AGGREGATE array_intersect_agg(
sfunc = array_intersect,
basetype = text[],
stype = text[]
);
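For completeness, here is a sketch of how the pieces might fit together for text arrays using the newer CREATE AGGREGATE syntax. The text[] overload of array_intersect is an assumption (the original only shows the int[] version), mirrored from it with one change: an empty intersection is returned as an empty array so it cannot be confused with the NULL starting state.

```sql
-- Assumed text[] overload of array_intersect, mirroring the int[] version.
create function array_intersect(a1 text[], a2 text[]) returns text[] as $$
declare
    ret text[];
begin
    -- NULL stands in for the "universal set" so the aggregate can start from NULL.
    if a1 is null then
        return a2;
    elsif a2 is null then
        return a1;
    end if;
    select array_agg(e) into ret
    from (
        select unnest(a1)
        intersect
        select unnest(a2)
    ) as dt(e);
    -- array_agg over zero rows yields NULL; return an empty array instead.
    return coalesce(ret, '{}'::text[]);
end;
$$ language plpgsql;

-- No initcond: the state value starts out null, as the quoted documentation says.
create aggregate array_intersect_agg(text[]) (
    sfunc = array_intersect,
    stype = text[]
);
```

Leaving out initcond avoids the "array value must start with {" error, because no string constant has to be parsed as a text[] value.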