How to Use Many Like Operators and Use Index

You can create a trigram index that will support your query.

For that you need the pg_trgm extension; run the following as superuser:

CREATE EXTENSION pg_trgm;

Then you can create a GIN index:

CREATE INDEX ON dir USING gin (path gin_trgm_ops);

This index can be used with your second and third approach, so it should do the trick for you.

With short patterns like the one in your examples, the index will not be very effective.

You can also use a GiST index, which will probably be smaller but slower to search.

Note that this index can also be used with patterns that start with %.
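To get a feel for why a trigram index can serve patterns with a leading % and why very short patterns gain little from it, here is a simplified Python sketch of the idea. It is not how pg_trgm is actually implemented: the helper names (trigrams, build_index, like_contains) are made up for illustration, it skips the padding and word splitting that pg_trgm applies, and it only handles patterns of the form %needle%.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(rows):
    """Map each trigram to the set of row ids whose value contains it."""
    index = defaultdict(set)
    for row_id, value in enumerate(rows):
        for gram in trigrams(value):
            index[gram].add(row_id)
    return index


def like_contains(rows, index, needle):
    """Emulate `value LIKE '%needle%'` using the trigram index as a filter."""
    grams = trigrams(needle)
    if not grams:
        # Fewer than 3 characters: no trigram to look up, so scan every row.
        candidates = range(len(rows))
    else:
        # A matching row must contain every trigram of the needle.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Recheck: sharing the trigrams does not guarantee a real match.
    return sorted(i for i in candidates if needle in rows[i])


# Hypothetical sample data standing in for the dir.path column.
paths = ["/usr/local/bin", "/usr/share/doc", "/home/alice/docs", "/var/log"]
idx = build_index(paths)
print(like_contains(paths, idx, "doc"))    # one trigram lookup, then recheck
print(like_contains(paths, idx, "local"))  # three trigram lookups intersected
print(like_contains(paths, idx, "lo"))     # too short: falls back to a full scan
```

The position of the needle inside the value never matters to the lookup, which is why a leading % is not a problem here. A two-character needle yields no trigram at all, so the sketch has nothing to look up and checks every row.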

Best type of indexing when there is LIKE clause

Yep, what you have here is a database killer.

A B-tree index can be used for column comparisons in expressions that
use the =, >, >=, <, <=, or BETWEEN operators. The index also can be
used for LIKE comparisons if the argument to LIKE is a constant string
that does not start with a wildcard character.

Source: http://dev.mysql.com/doc/refman/5.7/en/index-btree-hash.html

So the first LIKE expression, which starts with a wildcard, cannot use the index, and on top of that you have two LIKEs connected with an OR. As if that weren't enough, you have thrown in a NOT IN comparison as well.

Fortunately, the second LIKE expression isn't so bad: it doesn't start with a wildcard. So your best hope is to create a composite index on usage_guidance, name.

If you could post your SHOW CREATE TABLE and a few lines of sample data + the expected output, we might get an idea if there is a way to rewrite this query.

How can I introduce multiple conditions in LIKE operator?

Here is an alternative way:

select * from tbl where col like 'ABC%'
union
select * from tbl where col like 'XYZ%'
union
select * from tbl where col like 'PQR%';

Here is the test code to verify:

create table tbl (col varchar(255));
insert into tbl (col) values ('ABCDEFG'), ('HIJKLMNO'), ('PQRSTUVW'), ('XYZ');
select * from tbl where col like 'ABC%'
union
select * from tbl where col like 'XYZ%'
union
select * from tbl where col like 'PQR%';
+----------+
| col      |
+----------+
| ABCDEFG  |
| XYZ      |
| PQRSTUVW |
+----------+
3 rows in set (0.00 sec)
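If you want to check that the UNION form returns the same rows as the plain OR form, here is a rough sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here only to keep the example self-contained, so treat it as an illustration rather than a MySQL test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col VARCHAR(255))")
conn.executemany(
    "INSERT INTO tbl (col) VALUES (?)",
    [("ABCDEFG",), ("HIJKLMNO",), ("PQRSTUVW",), ("XYZ",)],
)

union_rows = conn.execute("""
    SELECT * FROM tbl WHERE col LIKE 'ABC%'
    UNION
    SELECT * FROM tbl WHERE col LIKE 'XYZ%'
    UNION
    SELECT * FROM tbl WHERE col LIKE 'PQR%'
""").fetchall()

or_rows = conn.execute("""
    SELECT * FROM tbl
    WHERE col LIKE 'ABC%' OR col LIKE 'XYZ%' OR col LIKE 'PQR%'
""").fetchall()

# Row order is unspecified without ORDER BY, so compare sorted copies.
print(sorted(union_rows) == sorted(or_rows))
```

Keep in mind that UNION removes duplicate rows while the OR form keeps them, so the two only agree like this when the selected rows are distinct, as they are in this sample data.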

Is there a combination of LIKE and IN in SQL?

There is no combination of LIKE and IN in SQL, much less in T-SQL (SQL Server) or PL/SQL (Oracle). Part of the reason is that Full Text Search (FTS) is the recommended alternative.

Both Oracle and SQL Server FTS implementations support the CONTAINS keyword, but the syntax is still slightly different:

Oracle:

WHERE CONTAINS(t.something, 'bla OR foo OR batz', 1) > 0

SQL Server:

WHERE CONTAINS(t.something, '"bla*" OR "foo*" OR "batz*"')

The column you are querying must be full-text indexed.

Reference:

  • Building Full-Text Search Applications with Oracle Text
  • Understanding SQL Server Full-Text

SQL Server Index - Any improvement for LIKE queries?

Only if you add full-text searching to those columns, and use the full-text query capabilities of SQL Server.

Otherwise, no, an index will not help.

Why can WHERE col LIKE 'a%' use an index, but not WHERE col LIKE '%a%'?

Although the actual details of MySQL's B-tree indexes are more complicated than this, for most purposes it's close enough to say that having an index on a column lets the MySQL engine perform SELECTs on your table as if it were ordered by that column.

If the code column has an index on it, and you're searching for records where code LIKE 'a%', then all MySQL (or whatever other SQL package, as long as it's sufficiently clever) has to do is spit out all the records from the start of 'a' to the start of 'b'. However, if you're searching for records where code LIKE '%a%', then having the table already ordered by code won't help you, because whether a row matches the WHERE clause has no simple relationship to its position in the index. So for the second query, there's nothing the database can reasonably do except check every character of the code entry of every single row in the table (unless it already has the result cached).

This is fairly easy to understand intuitively, because you can imagine doing something reasonably analogous yourself, as a human. If you want to find all the words in the Oxford English Dictionary that begin with 'a', then you just go through all the pages from the start of 'a' to the start of 'b', and everything you see is a word starting with 'a'. If you want to find all the words in the dictionary with an 'a' in them anywhere, then the dictionary being ordered doesn't offer you much help. If you're sophisticated enough, you can plausibly exploit the ordering of the dictionary a little (such as by using your knowledge that all the words before the first 'b...' word in the dictionary contain an 'a'), but ultimately you're gonna have to look at almost every single word.
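The dictionary analogy can be sketched in a few lines of Python, with a sorted list standing in for the index. This is only an illustration of the idea, not of how MySQL stores its B-trees, and the function names and sample values are made up.

```python
from bisect import bisect_left


def prefix_range(sorted_codes, prefix):
    """Emulate `code LIKE 'prefix%'` on a sorted list (a stand-in for an index)."""
    lo = bisect_left(sorted_codes, prefix)
    # Smallest string greater than everything starting with prefix:
    # bump the last character ('a' -> 'b'). Assumes a non-empty prefix
    # whose last character is not the maximum code point.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect_left(sorted_codes, upper)
    return sorted_codes[lo:hi]


def contains_scan(sorted_codes, needle):
    """Emulate `code LIKE '%needle%'`: the ordering is no help, check every row."""
    return [code for code in sorted_codes if needle in code]


codes = sorted(["apple", "avocado", "banana", "cherry", "grape", "kiwi"])
print(prefix_range(codes, "a"))   # two binary searches, then one contiguous slice
print(contains_scan(codes, "a"))  # every entry has to be inspected
```

The prefix search touches only the slice between the start of 'a' and the start of 'b', however long the list is. The substring search has to look at all six entries, and would have to look at all six million just the same.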

Correct indexing when using OR operator

You misunderstand how indexes work.

Think of a telephone book (the equivalent of a two-column index on last name first, first name last). If I ask you to find all people in the telephone book whose last name is "Smith," you can benefit from the fact that the names are ordered that way; you can assume that the Smiths are organized together. But if I ask you to find all the people whose first name is "John" you get no benefit from the index. Johns can have any last name, and so they are scattered throughout the book and you end up having to search the hard way, from cover to cover.

Now if I ask you to find all people whose last name is "Smith" OR whose first name is "John", you can find the Smiths easily as before, but that doesn't help you at all to find the Johns. They're still scattered throughout the book and you have to search for them the hard way.

It's the same with multi-column indexes in SQL. The index is sorted by the first column, then sorted by the second column in cases of ties in the first column, then sorted by the third column in cases of ties in both the first two columns, etc. It is not sorted by all columns simultaneously. So your multi-column index doesn't help to make your search terms more efficient, except for the left-most column in the index.
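Here is the telephone book as a small Python sketch, assuming a sorted list of (last_name, first_name) tuples as a stand-in for the two-column index. The names and helper functions are invented for the example.

```python
from bisect import bisect_left, bisect_right

# A two-column "index" on (last_name, first_name): tuples kept in sorted order.
PHONE_BOOK = sorted([
    ("Jones", "John"),
    ("Smith", "Alice"),
    ("Smith", "John"),
    ("Taylor", "Bob"),
    ("Adams", "John"),
])
# The leading column on its own is sorted too, because the tuples are.
LAST_NAMES = [last for last, _ in PHONE_BOOK]


def find_by_last_name(last_name):
    """Leftmost column: matches are contiguous, so binary search finds them."""
    lo = bisect_left(LAST_NAMES, last_name)
    hi = bisect_right(LAST_NAMES, last_name)
    return PHONE_BOOK[lo:hi]


def find_by_first_name(first_name):
    """Second column only: matches are scattered, so every entry is checked."""
    return [entry for entry in PHONE_BOOK if entry[1] == first_name]


smiths = find_by_last_name("Smith")
johns = find_by_first_name("John")
# last_name = 'Smith' OR first_name = 'John': the fast lookup for the Smiths
# does nothing to avoid the cover-to-cover scan for the Johns.
smith_or_john = sorted(set(smiths) | set(johns))
print(smiths)
print(johns)
print(smith_or_john)
```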

Back to your original question.

What would be the correct way to index such a table for this query?

Create a separate, single-column index on each column. One of these indexes will be a better choice than the others, based on MySQL's estimation of how many I/O operations the index will incur if it is used.

Modern versions of MySQL also have some smarts about index merging, so a query may use more than one index on a given table and then merge the results. Otherwise MySQL tends to be limited to using one index per table in a given query.

Another trick that a lot of people use successfully is to do a separate query for each of your indexed columns (which should use the respective index) and then UNION the results.

SELECT fields FROM table WHERE field1='something' 
UNION
SELECT fields FROM table WHERE field2='something'
UNION
SELECT fields FROM table WHERE field3='something'
UNION
SELECT fields FROM table WHERE field4='something'

One final observation: if you find yourself searching for the same 'something' across four fields, you should consider whether all four fields are actually the same thing, and whether you're guilty of designing a table that violates First Normal Form with repeating groups. If so, perhaps field1 through field4 belong in a single column in a child table. Then it becomes a lot easier to index and query:

SELECT fields FROM table INNER JOIN child_table ON table.pk = child_table.fk
WHERE child_table.field = 'something'
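A runnable version of the child-table idea might look like the sketch below, using Python's built-in sqlite3 module. The schema is an assumption for illustration: the parent table is called parent because table is a reserved word, and the name column, the index name and the sample rows are all invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (pk INTEGER PRIMARY KEY, name TEXT);
    -- One row per value that used to live in field1..field4.
    CREATE TABLE child_table (
        fk INTEGER NOT NULL REFERENCES parent (pk),
        field TEXT NOT NULL
    );
    -- A single index now covers every former fieldN column.
    CREATE INDEX child_field_idx ON child_table (field);

    INSERT INTO parent (pk, name) VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO child_table (fk, field) VALUES
        (1, 'something'), (1, 'other'),
        (2, 'other'),
        (3, 'else'), (3, 'something');
""")

# DISTINCT guards against a parent that holds the value in several child rows.
rows = conn.execute(
    """
    SELECT DISTINCT parent.pk, parent.name
    FROM parent
    INNER JOIN child_table ON parent.pk = child_table.fk
    WHERE child_table.field = ?
    ORDER BY parent.pk
    """,
    ("something",),
).fetchall()
print(rows)
```

With this layout there is one equality condition on one indexed column, instead of four conditions joined by OR across four columns.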

