Is There a Postgres Command to List/Drop All Materialized Views

Is there a postgres command to list/drop all materialized views?

Pure SQL

Show all:

SELECT oid::regclass::text
FROM pg_class
WHERE relkind = 'm';

In the cast from regclass to text, names are automatically double-quoted and schema-qualified where needed, according to your current search_path.

In the system catalog pg_class, materialized views are tagged with relkind = 'm'.

The manual:

m = materialized view

To drop all, you can generate the needed SQL script with this query:

SELECT 'DROP MATERIALIZED VIEW ' || string_agg(oid::regclass::text, ', ') 
FROM pg_class
WHERE relkind = 'm';

Returns:

DROP MATERIALIZED VIEW mv1, some_schema_not_in_search_path.mv2, ...

One DROP MATERIALIZED VIEW statement can take care of multiple materialized views. You may need to add CASCADE at the end if you have nested views.
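For example, if a regular view v1 (hypothetical name) happens to be defined on top of mv1, dropping mv1 requires CASCADE to take the dependent view down with it:

DROP MATERIALIZED VIEW mv1 CASCADE;  -- also drops dependent objects like v1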

Inspect the resulting DDL script to be sure before executing it. Are you sure you want to drop all MVs from all schemas in the db? And do you have the required privileges to do so? (Currently there are no materialized views in a fresh standard installation.)

Meta command in psql

In the default interactive terminal psql, you can use the meta-command:

\dm

It executes this query on the server:

SELECT n.nspname as "Schema",
       c.relname as "Name",
       CASE c.relkind
          WHEN 'r' THEN 'table'
          WHEN 'v' THEN 'view'
          WHEN 'm' THEN 'materialized view'
          WHEN 'i' THEN 'index'
          WHEN 'S' THEN 'sequence'
          WHEN 's' THEN 'special'
          WHEN 'f' THEN 'foreign table'
          WHEN 'p' THEN 'partitioned table'
          WHEN 'I' THEN 'partitioned index'
       END as "Type",
       pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('m','')
  AND n.nspname <> 'pg_catalog'
  AND n.nspname <> 'information_schema'
  AND n.nspname !~ '^pg_toast'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;

Which can be reduced to:

SELECT n.nspname as "Schema",
       c.relname as "Name",
       pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'm'
  AND n.nspname <> 'pg_catalog'
  AND n.nspname <> 'information_schema'
  AND n.nspname !~ '^pg_toast'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
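Note that \dm only shows materialized views visible in your search_path. To list them across all schemas, pass a pattern to the meta-command:

\dm *.*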

How to drop materialized views using EXECUTE statement in PostgreSQL

The error message indicates that you have non-standard names, created with double-quoting, like "Dimension" (mixed case). You need to quote and escape identifiers properly in dynamic SQL, not only because it doesn't work any other way, but also to avoid SQL injection.

Plus, you may have to schema-qualify names. Details:

  • Table name as a PostgreSQL function parameter
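A minimal sketch of such a loop, relying on the cast to regclass for proper quoting and schema-qualification (the DO block and variable name are illustrative):

DO
$$
DECLARE
   _mv text;
BEGIN
   FOR _mv IN
      SELECT oid::regclass::text FROM pg_class WHERE relkind = 'm'
   LOOP
      -- _mv is already quoted and schema-qualified by the regclass cast,
      -- so %s (not %I) is the right format specifier here
      EXECUTE format('DROP MATERIALIZED VIEW %s', _mv);
   END LOOP;
END
$$;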

Also, you could drop multiple MVs at once, and this way you don't need a loop:

DO
$$
BEGIN
   EXECUTE (
      SELECT 'DROP MATERIALIZED VIEW ' || string_agg(oid::regclass::text, ', ')
      FROM   pg_class
      WHERE  relkind = 'm'
   );  -- raises an error if no materialized view exists (string_agg() returns NULL)
END
$$;

Careful with this! It drops all materialized views in all schemas of your current database. You may want to double-check first.

Note how I am using oid::regclass::text, not quote_ident(relname). That also covers the schema name automatically. Detailed explanation in the link provided above.

Dropping all views in PostgreSQL

You can select the views from the meta tables like this (the actual SELECT may differ if you use an older version; see e.g. http://www.alberton.info/postgresql_meta_info.html):

SELECT 'DROP VIEW ' || quote_ident(table_schema) || '.' || quote_ident(table_name) || ';'
FROM information_schema.views
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
AND table_name !~ '^pg_';

So fix this SELECT according to your actual version, run it, save the results into a .sql file, and run that file.
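In psql (9.6 or later) you can also skip the intermediate file and feed the generated statements straight back to the server with \gexec:

SELECT 'DROP VIEW ' || quote_ident(table_schema) || '.' || quote_ident(table_name)
FROM information_schema.views
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
AND table_name !~ '^pg_' \gexec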

Get count of all materialized views

1) A space is missing after from, so you are executing the query

SELECT count(*) frommy_view

instead of

SELECT count(*) from my_view

So the query is syntactically valid but not what you want: frommy_view is parsed as a mere column alias for count(*), and without a FROM clause the query returned a default int value (1) on my system.

2) Your RAISE NOTICE is outside the loop, so you only get a notice for the very last query result. Put it into the loop body and it works.

DO $$
DECLARE
   rec record;
   my_pk_new integer;
BEGIN
   FOR rec IN
      SELECT schemaname, matviewname
      FROM   pg_matviews
      LIMIT  2
   LOOP
      -- format() with %I quotes identifiers properly
      EXECUTE format('SELECT count(*) FROM %I.%I', rec.schemaname, rec.matviewname)
      INTO my_pk_new;

      RAISE NOTICE 'Calling (%)', my_pk_new;  -- inside the loop now
   END LOOP;
END;
$$ LANGUAGE plpgsql;
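With two materialized views present, the output looks like this (the counts are illustrative):

NOTICE:  Calling (42)
NOTICE:  Calling (17)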

How to refresh all materialized views in Postgresql 9.3 at once?

It looks like the current version of PostgreSQL (9.3.1) does not have such functionality, so I had to write my own function instead:

CREATE OR REPLACE FUNCTION RefreshAllMaterializedViews(schema_arg TEXT DEFAULT 'public')
RETURNS INT AS $$
DECLARE
   r RECORD;
BEGIN
   RAISE NOTICE 'Refreshing materialized views in schema %', schema_arg;
   FOR r IN SELECT matviewname FROM pg_matviews WHERE schemaname = schema_arg
   LOOP
      RAISE NOTICE 'Refreshing %.%', schema_arg, r.matviewname;
      -- format() with %I quotes the identifiers properly
      EXECUTE format('REFRESH MATERIALIZED VIEW %I.%I', schema_arg, r.matviewname);
   END LOOP;

   RETURN 1;
END
$$ LANGUAGE plpgsql;

(on github: https://github.com/sorokine/RefreshAllMaterializedViews)
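Usage, assuming the function was created as above ('reporting' is a hypothetical schema name):

SELECT RefreshAllMaterializedViews();            -- refreshes all MVs in schema public
SELECT RefreshAllMaterializedViews('reporting'); -- refreshes all MVs in schema reporting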

PostgreSQL - get materialized view column metadata

Queries for this kind of question can easily be retrieved by running psql with the -E ("echo hidden queries") option.

The following query should do what you want:

SELECT a.attname,
       pg_catalog.format_type(a.atttypid, a.atttypmod),
       a.attnotnull
FROM pg_attribute a
JOIN pg_class t ON a.attrelid = t.oid
JOIN pg_namespace s ON t.relnamespace = s.oid
WHERE a.attnum > 0
  AND NOT a.attisdropped
  AND t.relname = 'mv_name' --<< replace with the name of the MV
  AND s.nspname = 'public'  --<< change to the schema your MV is in
ORDER BY a.attnum;
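In psql itself, the meta-command below shows the same column information interactively (it is essentially this catalog query that -E would echo):

\d mv_name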

Replacement for materialized view on PostgreSQL

  • Whenever there is scope for growth, the best way to scale is to find a way to repeat a process on incremental data.
  • To explain this, we name the table that has been mentioned 'Tab':

    Tab
    Number | ID | CreationTime

    with an index on the creationtime column.
  • The key to applying the incremental method is to have a monotonically increasing value.
    Here, 'creationtime' serves that purpose.

(a) Create another table Tab_duplicate with an additional column 'last_compute_timestamp', say:

Tab_duplicate
Number | ID | Duplicate_count | last_compute_timestamp

(b) Create an index on column 'last_compute_timestamp'.

(c) Run the INSERT to find the duplicate records and insert them into Tab_duplicate along with the last_compute_timestamp.

(d) For repeated execution:

  1. Install the extension pg_cron (if it is not there) and automate this INSERT (see the sketch after this list):
     https://github.com/citusdata/pg_cron
     https://fatdba.com/2021/07/30/pg_cron-probably-the-best-way-to-schedule-jobs-within-postgresql-database/

or

  2. Use a shell script or Python script to execute it on the DB through the OS crontab.
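A minimal pg_cron sketch of option 1 (the job name and five-minute schedule are illustrative; the dollar-quoted command is the capture INSERT from Step 4 of the demonstration below):

SELECT cron.schedule(
   'capture-duplicates',   -- hypothetical job name
   '*/5 * * * *',          -- every five minutes
$job$
INSERT INTO tab_duplicate
SELECT a.id, a.number, a.duplicate_count, b.last_compute_timestamp
FROM  (SELECT id, number, count(*) AS duplicate_count
       FROM   tab,
              (SELECT max(last_compute_timestamp) AS lct FROM tab_duplicate) max_date
       WHERE  creationtime > max_date.lct
       GROUP  BY id, number) a,
      (SELECT max(creationtime) AS last_compute_timestamp
       FROM   tab,
              (SELECT max(last_compute_timestamp) AS lct FROM tab_duplicate) max_date
       WHERE  creationtime > max_date.lct) b
$job$
);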

Because last_compute_timestamp is recorded in every iteration and reused in the next, the computation is incremental and always fast.

DEMONSTRATION:

Step 1: Production table

create table tab
(
   id int,
   number int,
   creationtime timestamp
);
create index tab_id on tab(creationtime);

Step 2: Duplicate capture table, with a one-time priming record (this can be removed after the first execution)

create table tab_duplicate
(
   id int,
   number int,
   duplicate_count int,
   last_compute_timestamp timestamp
);
create index tab_duplicate_idx on tab_duplicate(last_compute_timestamp);
insert into tab_duplicate values(0,0,0,current_timestamp);

Step 3: Some duplicate entries into the production table

insert into tab values(1,10,current_timestamp);
select pg_sleep(1);
insert into tab values(1,10,current_timestamp);
insert into tab values(1,10,current_timestamp);
select pg_sleep(1);
insert into tab values(2,20,current_timestamp);
select pg_sleep(1);
insert into tab values(2,20,current_timestamp);
select pg_sleep(1);
insert into tab values(3,30,current_timestamp);
insert into tab values(3,30,current_timestamp);
select pg_sleep(1);
insert into tab values(4,40,current_timestamp);

Verify records:

postgres=# select * from tab;
 id | number |        creationtime
----+--------+----------------------------
  1 |     10 | 2022-01-23 19:00:37.238865
  1 |     10 | 2022-01-23 19:00:38.248574
  1 |     10 | 2022-01-23 19:00:38.252622
  2 |     20 | 2022-01-23 19:00:39.259584
  2 |     20 | 2022-01-23 19:00:40.26655
  3 |     30 | 2022-01-23 19:00:41.274673
  3 |     30 | 2022-01-23 19:00:41.279298
  4 |     40 | 2022-01-23 19:00:52.697257
(8 rows)

Step 4: Duplicates captured and verified.

INSERT INTO tab_duplicate
SELECT a.id,
       a.number,
       a.duplicate_count,
       b.last_compute_timestamp
FROM   -- a: duplicate counts among rows newer than the last computed timestamp
       (SELECT id,
               number,
               count(*) AS duplicate_count
        FROM   tab,
               (SELECT max(last_compute_timestamp) AS lct
                FROM   tab_duplicate) max_date
        WHERE  creationtime > max_date.lct
        GROUP  BY id, number) a,
       -- b: the new high-water mark, recorded for the next incremental run
       (SELECT max(creationtime) AS last_compute_timestamp
        FROM   tab,
               (SELECT max(last_compute_timestamp) AS lct
                FROM   tab_duplicate) max_date
        WHERE  creationtime > max_date.lct) b;

Execute it (psql reports the four captured rows):

INSERT 0 4

Verify:

postgres=# select * from tab_duplicate;
 id | number | duplicate_count |   last_compute_timestamp
----+--------+-----------------+----------------------------
  0 |      0 |               0 | 2022-01-23 19:00:25.779671
  3 |     30 |               2 | 2022-01-23 19:00:52.697257
  1 |     10 |               3 | 2022-01-23 19:00:52.697257
  4 |     40 |               1 | 2022-01-23 19:00:52.697257
  2 |     20 |               2 | 2022-01-23 19:00:52.697257
(5 rows)

Step 5: Some more duplicates into the production table

insert into tab values(5,50,current_timestamp);
select pg_sleep(1);
insert into tab values(5,50,current_timestamp);
select pg_sleep(1);
insert into tab values(5,50,current_timestamp);
select pg_sleep(1);
insert into tab values(6,60,current_timestamp);
select pg_sleep(1);
insert into tab values(6,60,current_timestamp);
select pg_sleep(1);

Step 6: The same duplicate capture SQL, executed again, captures ONLY the incremental records in the production table.


Execute the same INSERT as in Step 4; only the two new duplicate groups are captured:

INSERT 0 2

Verify:

postgres=# select * from tab_duplicate;
 id | number | duplicate_count |   last_compute_timestamp
----+--------+-----------------+----------------------------
  0 |      0 |               0 | 2022-01-23 19:00:25.779671
  3 |     30 |               2 | 2022-01-23 19:00:52.697257
  1 |     10 |               3 | 2022-01-23 19:00:52.697257
  4 |     40 |               1 | 2022-01-23 19:00:52.697257
  2 |     20 |               2 | 2022-01-23 19:00:52.697257
  5 |     50 |               3 | 2022-01-23 19:02:37.884417
  6 |     60 |               2 | 2022-01-23 19:02:37.884417
(7 rows)

This duplicate capture will always be fast because of two things:

  1. It works only on the incremental data since the last scheduled run.

  2. Scanning the table to find the maximum timestamp uses a single-column index (an index-only scan).

From execution plan:

->  Index Only Scan Backward using tab_duplicate_idx on tab_duplicate tab_duplicate_2  (cost=0.15..77.76 rows=1692 width=8)

CAVEAT: If duplicates accumulate in tab_duplicate over a longer period of time, you can dedupe the records in tab_duplicate periodically, say at the end of the day. That will be fast anyway, because tab_duplicate is a small aggregated table that is offline to your application, whereas tab is your production table with huge accumulated data.
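A possible dedupe sketch, assuming duplicate_count for the same (id, number) should be summed across runs (the table rebuild and index recreation are illustrative, not part of the original design):

BEGIN;
CREATE TABLE tab_duplicate_new AS
SELECT id,
       number,
       sum(duplicate_count)        AS duplicate_count,
       max(last_compute_timestamp) AS last_compute_timestamp
FROM   tab_duplicate
GROUP  BY id, number;
DROP TABLE tab_duplicate;
ALTER TABLE tab_duplicate_new RENAME TO tab_duplicate;
CREATE INDEX tab_duplicate_idx ON tab_duplicate(last_compute_timestamp);
COMMIT;

The overall max(last_compute_timestamp) is preserved, so the next incremental run is unaffected.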

Also, a trigger on the production table is a viable solution, but it adds overhead to transactions on the production table, as trigger execution has a cost for every insert.


