Index Spanning Multiple Tables in Postgresql

Index spanning multiple tables in PostgreSQL

No it's not possible to have an index on many tables, also it really wouldn't guarantee speeding up anything since you won't always get an Index Only Scan. What you really want is a materialized view but pg doesn't have those either. You can try implementing it yourself using triggers like this or this.


Update

As noted by @petter. The materialized views were introduced in 9.3.

Does PostgreSQL implement multi-table indexes?

As of the current version of the PostgreSQL (v 12), an index can be based on a table or materialized view only.

https://www.postgresql.org/docs/current/sql-createindex.html

CREATE INDEX constructs an index on the specified column(s) of the
specified relation, which can be a table or a materialized view.

The CREATE INDEX syntax requires a table and there can only 1 table specificed

CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON
[ ONLY ] table_name [ USING method ]

table_name:

The name (possibly schema-qualified) of the table to be indexed.

The materialized view is an option but, the data in materialized view is stale until you refresh the data.

https://www.postgresql.org/docs/12/sql-creatematerializedview.html

CREATE MATERIALIZED VIEW defines a materialized view of a query. The
query is executed and used to populate the view at the time the
command is issued (unless WITH NO DATA is used) and may be refreshed
later using REFRESH MATERIALIZED VIEW.

You maybe able to balance it out by automating a process to run REFRESH MATERIALIZED VIEW command in a way to reduce the likelihood of stale data. For example, after large data imports and at regular intervals at other times. But, if your data is large enough to require indexing, the refresh & re-indexing process will not be fast enough and thus you won't be able to execute it after every CRUD statement in an OLTP scenario.

In conclusion, what you are looking for does not exist in PostgreSQL as of v 12.

What is the optimised way to search a text and string from multiple tables in PostgreSQL

Your idea to search using UNION ALL is good.

To speed up the substring search, you can use a trigram index:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX ON accounts USING gin (name gin_trgm_ops);

Optimizing indexes for query on large table with multiple joins

As for the query itself, the only thing you can do is skipping on users table. From EXPLAIN you can see that it only does an Index Only Scan without actually touching the table. So, technically your query could look like this:

SELECT images.* FROM images
INNER JOIN locations ON locations.id = images.location_id
INNER JOIN user_groups ON images.creator_id = user_groups.user_id
WHERE images.deleted_at IS NULL
AND user_groups.group_id = 7
AND images.creator_type = 'User'
AND images.status = 2
AND locations.active = TRUE
ORDER BY date_uploaded DESC
OFFSET 0 LIMIT 50

The rest is about indexes. locations seems to have very little data, so optimization here will gain you nothing. user_groups on the other hand could benefit from an index ON (user_id) WHERE group_id = 7 or ON (group_id, user_id). This should remove some extra filtering on table content.

-- Option 1
CREATE INDEX ix_usergroups_userid_groupid7
ON user_groups (user_id)
WHERE group_id = 7;

-- Option 2
CREATE INDEX ix_usergroups_groupid_userid
ON user_groups (group_id, user_id);

Of course, the biggest thing here is images. Currently, the planer would do an index scan on creator_date_uploaded_Where_pub_not_del which I suspect does not fully match the requirements. Here, multiple options come to mind depending on your usage pattern - from one where the search parameters are rather common:

-- Option 1
CREATE INDEX ix_images_creatorid_typeuser_status2_notdel
ON images (creator_id)
WHERE creator_type = 'User' AND status = 2 AND deleted_at IS NULL;

to one with completely dynamic parameters:

-- Option 2
CREATE INDEX ix_images_status_creatortype_creatorid_notdel
ON images (status, creator_type, creator_id)
WHERE deleted_at IS NULL;

The first index is preferable as it is smaller (values are filtered-out rather than indexed).

To summarize, unless you are limited by memory (or other factors), I would add indexes on user_groups and images. Correct choice of indexes must be confirmed empirically, as multiple options are usually available and the situation depends on statistical distribution of data.

Postgres full text search across multiple related tables

Your best answer is probably to have a separate tsvector column in each table (with an index on, of course). If you aggregate the data up to a shared tsvector, that'll create a lot of updates on that shared one whenever the individual ones update.

You will need one index per table. Then when you query it, obviously you need multiple WHERE clauses, one for each field. PostgreSQL will then automatically figure out which combination of indexes to use to give you the quickest results - likely using bitmap scanning. It will make your queries a little more complex to write (since you need multiple column matching clauses), but that keeps the flexibility to only query some of the fields in the cases where you want.

You cannot create one index that tracks multiple tables. To do that you need the separate tsvector column and triggers on each table to update it.



Related Topics



Leave a reply



Submit