How to force PostgreSQL to use my index?
Overriding cost parameters
You can't ever force PostgreSQL to use a particular index, or totally prevent it from doing a seqscan.
However, you can tell it to avoid doing certain scan types if it possibly can, by setting the relevant
enable_ parameters to
off. It's really a feature intended only for debugging.
For testing, try:
SET enable_seqscan = off;
if Pg can possibly use an index scan (or something else) it will.
You may also want to consider:
SET random_page_cost = 1.1
i.e. tell PostgreSQL that random I/O is only slightly more expensive than sequential I/O. This is usually true on systems with SSDs, or where most of the DB is cached in RAM. It will be more likely to choose an index in this case.
Of course, if your system's random I/O is actually more expensive, then using an index is likely to be slower.
Selectivity, partial indexes
What you should really do is follow the advice you've already been given. Create the index in order of selectivity - if
relevant is less common, use that. You can even go a step further and create a partial index:
CREATE INDEX idx_name_blah ON tbl_name_blah (factory_key) WHERE (NOT relevant);
This index only contains values for
relevant = 'f'. It can only be used for queries where the planner knows relevant will be false. On the other hand, it will be a much smaller, faster index.
You might also have inaccurate statistics, causing PostgreSQL to think value frequencies are different than they really are for your table.
explain analyze will help show this.
You can also just
ANALYZE my_table in case the stats are just out of date; if so, increase the frequency with which autovacuum runs because it's not keeping up.
If the stats are up to date but the planner is still making stats-based mis-estimations, increasing the statistics target for the table (see manual) and re-analyzing can help if it is actually a statistics mis-estimation problem.
Older PostgreSQL versions tend to be less smart about cost estimation, query optimization, statistics, query execution methods, and pretty much everything else.
If you're not on the latest version, upgrade.
For example, 9.2's index-only scans would allow you to create a partial index
(product_id, factory_key) WHERE (NOT relevant)
and then run a query:
SELECT product_id, factory_key FROM my_table WHERE NOT relevant;
that should only read the index, with no heap access at all.
How to make postgres not use a particular index?
You can probably disable the index by doing some dummy arithmetic on the indexed column.
...AND "chaindata_tokentransfer"."chain_id" + 0 = 1...
If you put that into production, make sure to add a code comment on why you are doing such an odd thing.
I'm curious why it chooses to use that index, despite apparently knowing how astonishingly awful it is. If you show the plan for the query with the index disabled, maybe we could figure that out.
If the dummy arithmetic doesn't work, what you could do is start a transaction, drop the index, execute the query (or the just the EXPLAIN of it), then rollback the drop. That is probably not something you want to do often in production (especially since the table will be locked from when the index is dropped until the rollback. Also because you might accidentally commit!) but getting the plan is probably worth doing it once.
PostgreSQL: specify an index to use
I'm afraid there is no such thing in Postgres:
Many people over the years have requested that the PostgreSQL project
implement "optimizer hints" or "query hints" as they are implemented
in other RDBMSes such as Oracle and MySQL. The official current stance
from the community is this:
We are not interested in implementing
hints in the exact ways they are commonly implemented on other
databases. Proposals based on "because they've got them" will not be
welcomed. If you have an idea that avoids the problems that have been
observed with other hint systems, that could lead to valuable
The wiki lists some alternatives to hinting here, however none of these suggestions are equivalents of what you are looking for.
However, you may be able to force the join order as described in 14.3. Controlling the Planner with Explicit JOIN Clauses, depending on what the values for
join_collapse_limit are (you may be able to set/reset them on the fly). This can then indirectly affect which indexes are used, but again there is no explicit choice of index.
Postgres query optimization (forcing an index scan)
For testing purposes you can force the use of the index by "disabling" sequential scans - best in your current session only:
SET enable_seqscan = OFF;
Do not use this on a productive server. Details in the manual here.
I quoted "disabling", because you cannot actually disable sequential table scans. But any other available option is now preferable for Postgres. This will prove that the multicolumn index on
(metric_id, t) can be used - just not as effective as an index on the leading column.
You probably get better results by switching the order of columns in your
PRIMARY KEY (and the index used to implement it behind the curtains with it) to
(t, metric_id). Or create an additional index with reversed columns like that.
- Is a composite index also good for queries on the first field?
You do not normally have to force better query plans by manual intervention. If setting
enable_seqscan = OFF leads to a much better plan, something is probably not right in your database. Consider this related answer:
- Keep PostgreSQL from sometimes choosing a bad query plan
How to Create Multiple One to One's
MySQL Select Dynamic Row Values as Column Names, Another Column as Value
Postgresql: Using a Calculated Column in the Same Query
Generate_Series() Equivalent in MySQL
Is There a Max Function in SQL Server That Takes Two Values Like Math.Max in .Net
Multi-Statement Table Valued Function VS Inline Table Valued Function
What Is the Simplest SQL Query to Find the Second Largest Value
MySQL Insert into Table Values.. VS Insert into Table Set
In Sql, What's the Difference Between Count(Column) and Count(*)
Convert Varchar into Datetime in SQL Server
MySQL Unknown Column in on Clause
How to Use Any Instead of in in a Where Clause
Insert into Multiple Tables in One Query
SQL Server 2005 Pivot on Unknown Number of Columns
Does Count(*) Always Return a Result
How to Split a Single Column Values to Multiple Column Values