Select (Retrieve) All Records from Multiple Schemas Using Postgres

With inheritance, as @Denis mentioned, this is very simple. It works for Postgres 8.4, too. Be sure to consider the limitations.

Basically, you would have a master table, I suppose in a master schema:

CREATE TABLE master.product (title text);

And all other tables in various schemata inherit from it, possibly adding more local columns:

CREATE TABLE a.product (product_id serial PRIMARY KEY, col2 text)
INHERITS (master.product);

CREATE TABLE b.product (product_id serial PRIMARY KEY, col2 text, col3 text)
INHERITS (master.product);

etc.

The tables don't have to share the same name or schema.

Then you can query all tables in one fell swoop:

SELECT title, tableoid::regclass::text AS source
FROM master.product
WHERE title ILIKE '%test%';

tableoid::regclass::text?

That's a handy way to tell the source of each row. Details:

  • Find out which schema based on table values

SQL Fiddle.

Can I select data across multiple schemas within the same SQL database?

If all tables have an identical structure, you can write a PL/pgSQL function that does this:

create function get_info(p_schema_prefix text)
    returns table (... column definitions go here ...)
as
$$
declare
    l_rec record;
    l_sql text;
begin
    for l_rec in select table_schema, table_name
                 from information_schema.tables
                 where table_name = 'info'
                   and table_schema like p_schema_prefix||'%'
    loop
        l_sql := format('select id, data from %I.%I', l_rec.table_schema, l_rec.table_name);
        return query execute l_sql;
    end loop;
end;
$$
language plpgsql;

Then use it like this:

select *
from get_info('team');

postgres select across all schemas

If your tables have different names (which should be the rule in most cases), either qualify them with the schema name or set the search path:

select * from foo.table1, bar.table2, baz.table3;

set search_path = foo, bar, baz;
select * from table1, table2, table3;

Important tip: Use explicit joins instead of listing tables in the FROM clause.

select * 
from table1
join table2 on ...
join table3 on ...

If you have tables with the same name in several schemas, you can operate on all of them from a function. You need information from the system catalogs:

  • pg_class - catalog containing information about all tables (and other relations) in a database,
  • pg_namespace - catalog containing information about all schemas in a database.

This query lists all tables named given_table_name in the database:

select n.nspname, c.relname
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where c.relkind = 'r'
  and c.relname = 'given_table_name';

The function below:

  • finds all tables with the given name in pg_class,
  • looks up pg_namespace to find the schema names of the tables found above,
  • executes a function for each table with the given name, adding the schema name as a prefix.

Assuming you have defined function do_something_with_this_table(tablename regclass) earlier:

create function do_something_with_all_these_tables(tablename text)
returns void language plpgsql
as $$
declare
    schemaname text;
begin
    for schemaname in
        select n.nspname
        from pg_class c
        join pg_namespace n on n.oid = c.relnamespace
        where c.relkind = 'r' and c.relname = tablename
    loop
        execute format(
            'select do_something_with_this_table(''%s.%s'')',
            schemaname, tablename);
    end loop;
end $$;

select do_something_with_all_these_tables('given_table_name');
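For illustration only, do_something_with_this_table could be as simple as the sketch below, which just reports the row count of the table it is given (this body is my assumption, not part of the answer above):

```sql
create function do_something_with_this_table(tablename regclass)
returns void language plpgsql
as $$
declare
    row_count bigint;
begin
    -- a regclass interpolated with %s renders as a safely quoted identifier
    execute format('select count(*) from %s', tablename) into row_count;
    raise notice '% has % rows', tablename, row_count;
end $$;
```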

Read more:

  • about plpgsql language
  • about format() function

Running query against all schemas in Postgres

You can define a Postgres procedure that uses dynamic commands, e.g.:

create or replace procedure clear_tenants()
language plpgsql as $function$
declare
    tenant text;
begin
    for tenant in
        select tenant_schema
        from public.tenant_schema_mappings
    loop
        execute format($ex$
            delete from %I.parent
            where expiration_date_time < now()
        $ex$, tenant);
    end loop;
end
$function$;

Then all your cron task has to do is call the procedure:

call clear_tenants();

In Postgres 10 or earlier, use a function or a do block instead of a procedure.
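For those older versions, the same logic as a do block might look like this (a sketch, reusing the tenant_schema_mappings table from above):

```sql
do $$
declare
    tenant text;
begin
    for tenant in
        select tenant_schema
        from public.tenant_schema_mappings
    loop
        execute format($ex$
            delete from %I.parent
            where expiration_date_time < now()
        $ex$, tenant);
    end loop;
end $$;
```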


The main disadvantage of this simple solution is that everything runs in a single transaction. Unfortunately, you cannot control transactions in procedures containing dynamic queries like this one. I would add a chunk_number column to the table describing the schemas and call the procedure for each chunk in its own transaction.

create or replace procedure public.clear_tenants(chunk integer)
language plpgsql as $function$
declare
    tenant text;
begin
    for tenant in
        select tenant_schema
        from public.tenant_schema_mappings
        where chunk_number = chunk
    loop
        execute format($ex$
            delete from %I.parent
            where expiration_date_time < now()
        $ex$, tenant);
    end loop;
end
$function$;

On the client side, prepare a script in this format:

-- in psql the procedures will be executed in separate transactions
-- if you do not use begin; explicitly
call clear_tenants(1);
call clear_tenants(2);
call clear_tenants(3);
...

or run many instances of psql, one per chunk (each in a separate connection). The latter option is practically the only way to enforce concurrency. It is, of course, limited by a reasonable number of concurrent connections.
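A minimal shell sketch of that last option, assuming a database named mydb and three chunks (both are illustrative assumptions):

```shell
#!/bin/sh
# Start one psql connection per chunk; each call runs in its own transaction.
for chunk in 1 2 3; do
    psql -d mydb -c "call public.clear_tenants($chunk);" &
done
wait  # block until every chunk has finished
```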


The following modified function emits a notice with the number of deleted rows for each tenant and returns the total number of deleted rows for the chunk:

create or replace function public.clear_tenants_modified(chunk integer)
returns bigint language plpgsql as $function$
declare
    tenant text;
    deleted_rows bigint;
    total_deleted_rows bigint := 0;
begin
    for tenant in
        select tenant_schema
        from public.tenant_schema_mappings
        where chunk_number = chunk
    loop
        execute format($ex$
            with delete_statement as (
                delete from %I.parent
                where expiration_date_time < now()
                returning 1 as x)
            select count(x)
            from delete_statement
        $ex$, tenant)
        into deleted_rows;
        raise notice '%: %', tenant, deleted_rows;
        total_deleted_rows := total_deleted_rows + deleted_rows;
    end loop;
    return total_deleted_rows;
end
$function$;

select clear_tenants_modified(1);

Postgres SQL query across different schemas

You will need some plpgsql and dynamic SQL for this. Here is an anonymous block for illustration:

do language plpgsql
$$
declare
    v_schema_name text;
    table_row_count bigint;
    sysSchema text[] := array['pg_toast','pg_temp_1','pg_toast_temp_1','pg_catalog','public','information_schema'];
    -- other declarations here
begin
    for v_schema_name in
        select schema_name
        from information_schema.schemata
        where schema_name != all(sysSchema)
    loop
        begin
            execute format('select count(col_x) from %I.t_table', v_schema_name)
                into table_row_count;
            raise notice 'Schema % count %', v_schema_name, table_row_count;
        exception when others then
            null; -- t_table may not exist in some schemata
        end;
        -- other statements here
    end loop;
end;
$$;

And by the way, WHERE col_x IS NOT NULL would be redundant here, since count(col_x) already ignores NULLs.

Run Query on All Schemas Postgres

The following catalog query will produce valid queries for every table in all schemas of your database. You can copy the output into a SQL file.

SELECT 'SELECT * FROM ' || table_schema || '.' || table_name || ';' AS query 
FROM information_schema.tables
WHERE table_schema IN
(
SELECT schema_name
FROM information_schema.schemata
WHERE schema_name NOT LIKE 'pg_%' AND schema_name != 'information_schema'
);
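If you run that query in psql (9.6 or later), you don't even need the intermediate file: the \gexec meta-command executes each row of the result as a statement. A sketch of the same query, terminated with \gexec instead of a semicolon:

```sql
SELECT 'SELECT * FROM ' || table_schema || '.' || table_name
FROM information_schema.tables
WHERE table_schema NOT LIKE 'pg_%'
  AND table_schema != 'information_schema'
\gexec
```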

Does this help?

Postgresql 9.1 select from all schemas

Different schemas mean different tables, so if you have to stick to this structure, it'll mean unions, one way or the other. That can be pretty expensive. If you're after partitioning through the convenience of search paths, it might make sense to reverse your schema:

Store a big table in the public schema, and then provision views in each of the individual schemas.

Check out this sqlfiddle that demonstrates my concept:

http://sqlfiddle.com/#!12/a326d/1

Also pasted inline for posterity, in case sqlfiddle is inaccessible:

Schema:

CREATE SCHEMA customer_1;
CREATE SCHEMA customer_2;

CREATE TABLE accounts(id serial, name text, value numeric, customer_id int);
CREATE INDEX ON accounts (customer_id);

CREATE VIEW customer_1.accounts AS SELECT id, name, value FROM public.accounts WHERE customer_id = 1;
CREATE VIEW customer_2.accounts AS SELECT id, name, value FROM public.accounts WHERE customer_id = 2;

INSERT INTO accounts(name, value, customer_id) VALUES('foo', 100, 1);
INSERT INTO accounts(name, value, customer_id) VALUES('bar', 100, 1);
INSERT INTO accounts(name, value, customer_id) VALUES('biz', 150, 2);
INSERT INTO accounts(name, value, customer_id) VALUES('baz', 75, 2);

Queries:

SELECT SUM(value) FROM public.accounts;

SET search_path TO 'customer_1';
SELECT * FROM accounts;

SET search_path TO 'customer_2';
SELECT * FROM accounts;

Results:

425

1   foo   100
2   bar   100

3   biz   150
4   baz   75

Querying across schemas in Postgres

Yes. And the syntax is exactly as you wrote it.


