Postgresql Does Not Use a Partial Index

PostgreSQL does not use a partial index

A partial index is a good idea to exclude half the rows of the table which you obviously do not need. Simpler:

CREATE INDEX name_idx ON table (text_col)
WHERE text_col IS NOT NULL;

Be sure to run ANALYZE table after creating the index. (Autovacuum does that automatically after some time if you don't do it manually, but if you test right after creation, your test will fail.)

Then, to convince the query planner that a particular partial index can be used, repeat the WHERE condition in the query - even if it seems completely redundant:

SELECT col1,col2, .. colN
FROM table
WHERE text_col = 'my_value'
AND text_col IS NOT NULL; -- repeat condition

Voilá.

Per documentation:

However, keep in mind that the predicate must match the conditions
used in the queries that are supposed to benefit from the index. To be
precise, a partial index can be used in a query only if the system can
recognize that the WHERE condition of the query mathematically implies
the predicate of the index. PostgreSQL does not have a sophisticated
theorem prover that can recognize mathematically equivalent
expressions that are written in different forms. (Not only is such a
general theorem prover extremely difficult to create, it would
probably be too slow to be of any real use.) The system can recognize
simple inequality implications, for example "x < 1" implies "x < 2";
otherwise the predicate condition must exactly match part of the
query's WHERE condition or the index will not be recognized as usable.
Matching takes place at query planning time, not at run time. As a
result, parameterized query clauses do not work with a partial index.

As for parameterized queries: again, add the (redundant) predicate of the partial index as an additional, constant WHERE condition, and it works just fine.


An important update in Postgres 9.6 largely improves chances for index-only scans (which can make queries cheaper and the query planner will more readily chose such query plans). Related:

  • PostgreSQL not using index during count(*)

Postgres is not using partial index

No rows are removed by the filter in the sequential scan, so it seems that all rows in the table have refuel_request_status_id equal to '1', '2' or '3'. So the WHERE condition in the index doesn't make a difference.

You could try with enable_seqscan = off to see if PostgreSQL made the right choice. If not, perhaps random_page_cost is not correctly set for your hardware.

If it turns out that the sequential scan is indeed the fastest way to go, there is nothing much you can do to speed up the query: you can get faster storage or more RAM to cache the table.

Counting is slow business.

PostgreSQL doesn't consistently use partial index

The thing here that is actually slow is that =ANY is implemented by looping over the 1000 members of your array and testing each one, and doing that for each of the 136000 rows it needs to inspect. That is a lot looping (but not 4 seconds worth in my hands, "only" 1.5s for me). Worse, the planner doesn't anticipate that the =ANY has such a poor implementation, so sees no reason to choose the other plan to avoid it.

v14 will fix this by using a hash table to implement the =ANY, so it will no longer be so inordinately slow.

If you can't/don't want to upgrade to v14, you could rewrite the query by joining to a VALUES list, rather than using =ANY

SELECT * FROM enrollments join (VALUES ('C1'),...,('C1000')) f(c) on c=consumer_id
WHERE deleted_at IS NULL AND portfolio_id = 1

Postgres partial vs regular / full index on nullable column

It all depends.

NULL values are included in (default) B-tree indexes since version Postgres 8.3, like Schwern provided. However, predicates like the one you mention (where colA is not null) are only properly supported since Postgres 9.0. The release notes:

Allow IS NOT NULL restrictions to use indexes (Tom Lane)

This is particularly useful for finding MAX()/MIN() values in
indexes that contain many null values.

GIN indexes followed later:

As of PostgreSQL 9.1, null key values can be included in the index.

Typically, a partial index makes sense if it excludes a major part of the table from the index, making it substantially smaller and saving writes to the index. Since B-tree indexes are so shallow, bare seek performance scales fantastically (once the index is cached). 10 % fewer index entries hardly matter in that area.

Your case would exclude only around 10% of all rows, and that rarely pays. A partial index adds some overhead for the query planner and excludes queries that don't match the index condition. (The Postgres query planner doesn't try hard if the match is not immediately obvious.)

OTOH, Postgres will rarely use an index for predicates retrieving 10 % of the table - a sequential scan will typically be faster. Again, it depends.

If (almost) all queries exclude NULL anyway (in a way the Postgres planner understands), then a partial index excluding only 10 % of all rows is still a sensible option. But it may backfire if query patterns change. The added complexity may not be worth it.

Also worth noting that there are still corner cases with NULL values in Postgres indexes. I bumped into this case recently where Postgres proved unwilling to read sorted rows from a multicolumn index when the first index expression was filtered with IS NULL (making a partial index preferable for the case):

db<>fiddle here

So, it depends on the complete picture.

Postgres IN clause with many values not using partial index

You are running an outdated version of Postgres. Consider upgrading soon.

There are many possible reasons. I suspect a weakness in the selectivity estimation of the outdated version. I vaguely remember a maximum of 100 values for query planning involving arrays that was later improved. IN expressions are typically transformed to = ANY (ARRAY[...]) internally:

  • How to use ANY instead of IN in a WHERE clause with Rails?

Either way, you might be able to fix the behavior by repeating the predicate of the partial index in your query:

SELECT * FROM _order 
WHERE merged_id IN ( 10001 ,10002 ,10003 ....., 11000)
AND merged_id is not null; -- logically redundant

There may be other problems with your server configuration like cost settings or table statistics:

  • Keep PostgreSQL from sometimes choosing a bad query plan

And don't forget to run ANALYZE on your table at least once after creating the partial index. Or, preferably, VACUUM ANALYZE, but that's more expensive for your big table.

  • PostgreSQL partial index unused when created on a table with existing data

However, for long lists of values, there are more efficient query variants to begin with:

SELECT o.*
FROM unnest('{10001 ,10002 ,10003 ....., 11000}'::int8[]) merged_id
JOIN _order o USING (merged_id);

See:

  • Optimizing a Postgres query with a large IN

Partial index not used on count

The partial index may be smaller, but the size of the part of the index that needs to be read will be about the same between the two indexes. Skipping the parts of the complete index which are for the wrong "status" will be very efficient.

Your table is obviously not very well vacuumed, based on both the heap fetches and the number of buffers dirtied and written. Try vacuuming the table and then repeating the queries.

Partial Index on Postgres where condition is in separate table

Not possible. From the manual:

The expression used in the WHERE clause can refer only to columns of
the underlying table, but it can use all columns, not just the ones
being indexed.

Indexing the condition of a partial index in Postgres

A b-tree index can be thought of an ordered list of index entries, each with a pointer to a row in the table.

In a partial index, the list is just smaller: there are only index entries for rows that meet the condition.

If you have the index condition in your WHERE clause, PostgreSQL knows it can use the index and doesn't have to check the index condition, because it will be satisfied automatically.

So:

  1. No, any row found via the index will automatically satisfy the index condition, so using the index is enough to make sure it is satisfied.

  2. No, an index on a boolean column will not be used, because it would not be cheaper than this partial index, and the partial index can be used to check the condition on order_nr as well.

    It is actually the other way around: the partial index could well be used for queries that only have the boolean column in the WHERE condition, if there are few enough rows that satisfy the condition.



Related Topics



Leave a reply



Submit