Which SQL Query Is Better, Match Against or Like


Update

As of MySQL 5.6, InnoDB tables also support MATCH ... AGAINST.


The first (MATCH ... AGAINST) is much better. On MyISAM tables it will use a full-text index on those columns. The other will do a full table scan, doing a concat on every row and then a comparison.

LIKE is only efficient if you're doing it against:

  • a column (not a result of a function unless your particular database vendor supports functional indexes--Oracle, for example--and you're using them);
  • the start of the column (i.e. LIKE 'blah%' as opposed to LIKE '%blah%'); and
  • a column that's indexed.

If any one of those conditions is not true, the only way for the SQL engine to execute the query is by doing a full table scan. This can be usable under about 10-20 thousand rows. Beyond that, however, it quickly becomes unusable.
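As a rough sketch of the three conditions above, `EXPLAIN` can be used to compare the variants. The `articles` table, its columns, and the index name are hypothetical and only serve the illustration; the plan the optimizer actually picks depends on table size and statistics.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE articles (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(150) NOT NULL,
    body  TEXT,
    INDEX idx_title (title)
) ENGINE=InnoDB;

-- Anchored at the start of an indexed column: eligible for a range scan on idx_title.
EXPLAIN SELECT id FROM articles WHERE title LIKE 'blah%';

-- Leading wildcard: the index cannot narrow the search, so every row
-- (or every index entry) has to be examined.
EXPLAIN SELECT id FROM articles WHERE title LIKE '%blah%';

-- Result of a function instead of a bare column: idx_title cannot be used for the lookup.
EXPLAIN SELECT id FROM articles WHERE CONCAT(title, ' ', body) LIKE 'blah%';
```

In the `EXPLAIN` output, the `type` and `key` columns are the ones to compare between the three statements.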

Note: One problem with MATCH on MySQL is that it only matches whole words, so a search for 'bla' won't match a column with a value of 'blah', but a search for 'bla*' IN BOOLEAN MODE will.

MySQL LIKE operator Vs MATCH AGAINST

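Your searches aren't equivalent. LIKE '%1%' will find ANYTHING that contains a 1, e.g. 100, 911, 0.1. It's just a plain substring match. MATCH ... AGAINST ('+1' IN BOOLEAN MODE) would theoretically work, but FULLTEXT by default ignores any "words" shorter than the minimum word length (4 characters on MyISAM, 3 on InnoDB). However, assuming you relaxed the fulltext length limit, +1 would find any INDEPENDENT 1, but not any that are embedded in another word. The * truncation operator only applies at the end of a word (+1* finds words starting with 1), so full-text search has no equivalent of %1%; for embedded matches you still need LIKE.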
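The word-length limit mentioned above is a server setting, so it is worth checking before comparing results. A minimal sketch, assuming a hypothetical `items` table with a FULLTEXT index on a `label` column (neither name comes from the question):

```sql
-- Minimum indexed word length: MyISAM uses ft_min_word_len (default 4),
-- InnoDB uses innodb_ft_min_token_size (default 3).
SHOW VARIABLES LIKE 'ft_min_word_len';
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Plain substring match: any label containing the character 1.
SELECT * FROM items WHERE label LIKE '%1%';

-- Word match: only rows where "1" was indexed as a word of its own,
-- which requires the minimum word length to be lowered to 1 and the
-- FULLTEXT index to be rebuilt afterwards.
SELECT * FROM items WHERE MATCH(label) AGAINST ('+1' IN BOOLEAN MODE);
```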

MySQL search query optimization: Match...Against vs %LIKE%

You don't have to switch to MyISAM. Full-text indexing is supported on InnoDB in MySQL 5.6 and higher.

I usually recommend using fulltext indexes. Create a fulltext index on your columns title, author, year.

Then you can run a fulltext query on all 3 at the same time, and apply IN BOOLEAN MODE to really narrow your searches. This is of course something you have to decide for yourself, but fulltext gives you more options.

However, if you are running queries that span a range (dates, for instance) or match a simple string, then a standard index is better. For text searching across different columns, a fulltext index is the way to go!
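The full-text setup described above could look roughly like the following. The `books` table and the index name are hypothetical; FULLTEXT indexes can only be created on CHAR, VARCHAR, or TEXT columns, so this sketch assumes `year` is stored as a string.

```sql
-- Hypothetical table; InnoDB FULLTEXT indexes need MySQL 5.6 or later.
CREATE TABLE books (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title  VARCHAR(200) NOT NULL,
    author VARCHAR(100) NOT NULL,
    `year` CHAR(4)      NOT NULL,
    FULLTEXT INDEX ft_books (title, author, `year`)
) ENGINE=InnoDB;

-- One query over all three columns. In boolean mode, + marks a word as
-- required and a trailing * matches any word starting with that prefix.
SELECT id, title, author, `year`
FROM books
WHERE MATCH(title, author, `year`)
      AGAINST ('+tolkien +ring*' IN BOOLEAN MODE);
```

The column list in MATCH() has to be the same as the column list of the FULLTEXT index.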

Read this: http://dev.mysql.com/doc/refman/5.6/en/fulltext-search.html

sql query using LIKE or MATCH

You can specify the condition multiple times:

SELECT *
FROM tab
WHERE col1 LIKE '%abc%'
OR col2 LIKE '%abc%'
OR col3 LIKE '%abc%'
OR col4 LIKE '%abc%'
OR col5 LIKE '%abc%';

This will be really slow because it combines multiple ORs with non-SARGable conditions.

Alternatively:

SELECT *
FROM tab
WHERE CONCAT_WS('^', col1, col2, col3, col4, col5) LIKE '%abc%';


Using MATCH (preferred solution that utilizes full-text index):

SELECT *
FROM tab
WHERE MATCH(col1, col2, col3, col4, col5) AGAINST ('abc');

Keep in mind that to use MATCH you need to create a FULLTEXT index first:

ALTER TABLE tab ADD FULLTEXT ft_index_name (col1,col2,col3,col4,col5);

Mysql match...against vs. simple like %term%

The difference is in the algorithms that MySQL uses behind the scenes to find your data. Fulltext searches also allow you to sort based on relevancy. The LIKE search in most conditions is going to do a full table scan, so depending on the amount of data, you could see performance issues with it. The fulltext engine can also have performance issues when dealing with large row sets.
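Sorting by relevancy can be sketched as follows. The `posts` table and its index are hypothetical; the idea is that MATCH ... AGAINST returns a relevance score when used in the select list, which can then drive the ORDER BY.

```sql
-- Hypothetical table with a FULLTEXT index over title and body.
CREATE TABLE posts (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    body  TEXT NOT NULL,
    FULLTEXT INDEX ft_posts (title, body)
) ENGINE=InnoDB;

-- The same MATCH expression is used once to filter and once as a score.
SELECT id,
       title,
       MATCH(title, body) AGAINST ('database index') AS score
FROM posts
WHERE MATCH(title, body) AGAINST ('database index')
ORDER BY score DESC
LIMIT 20;
```

A LIKE '%term%' filter has no comparable score, so ranking its results needs extra expressions written by hand.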

On a different note, one thing I would add to this code is something to escape the exploded values, perhaps a call to mysql_real_escape_string() (or, on current PHP versions where the mysql_* functions no longer exist, a prepared statement).
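One way to keep user-supplied terms out of the SQL text altogether is a prepared statement with a placeholder. This is a minimal server-side sketch; the `posts` table, the `title` column, and the sample term are assumptions for illustration, and in application code the same pattern would go through the driver's own prepared-statement API.

```sql
-- The search term is bound as data and never concatenated into the SQL string.
PREPARE search_stmt FROM
    'SELECT id, title FROM posts WHERE title LIKE CONCAT(''%'', ?, ''%'')';

-- @term stands in for one of the exploded user-supplied words.
SET @term = 'o''reilly';
EXECUTE search_stmt USING @term;

DEALLOCATE PREPARE search_stmt;
```

Binding protects against injection, but % and _ inside the term still act as LIKE wildcards unless they are escaped separately.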

MySql `MATCH AGAINST` and `LIKE` combination to search for special characters

Indexes can help with speed by limiting the number of rows to look at. Most code shown so far requires testing every row.

  • FULLTEXT is very good at finding rows when its rules apply. I doubt if +DN* applies due to word-length and existence of punctuation.
  • `LIKE "DN-NP%"` can use an index very efficiently. But that only works for the string being at the start of the column.
  • `LIKE "%DN-NP%"` -- The leading wildcard requires checking every row.
  • LOCATE and any other string operator -- not sargable, so needs to look at every row.
  • REGEXP "DN-NP" -- slower than LIKE. (There are other situations where `REGEXP` can be faster and/or `LIKE` won't apply.)

If you have the min word-length set to 2, then this trick may be the most efficient:

WHERE MATCH(col) AGAINST("+DN +NP" IN BOOLEAN MODE)
AND col LIKE '%DN-NP%'

The MATCH will efficiently whittle down the number of rows; the LIKE will whittle it down further, but only has to look at the small number of rows left by the MATCH.

Caveat: Which of these do you need to match or not match?:

abc DN-NP def
abc DNs-NPed def -- looks like "plural", etc. which FULLTEXT matches
abc DN-NPQRS def -- word boundary issue
abc ZYXDN-NP def

REGEXP can match a "word boundary"; LIKE does not have such.
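A word-boundary match with REGEXP might look like the sketch below. The `parts` table is hypothetical, and the boundary syntax depends on the server version: MySQL 8.0.4 and later use the ICU regular expression library, while earlier versions use the Henry Spencer implementation.

```sql
-- MySQL 8.0.4 and later: \b is a word boundary. The backslash is doubled
-- because the pattern is written as a string literal.
SELECT * FROM parts WHERE col REGEXP '\\bDN-NP\\b';

-- Earlier MySQL versions: [[:<:]] and [[:>:]] mark the start and end of a word.
SELECT * FROM parts WHERE col REGEXP '[[:<:]]DN-NP[[:>:]]';
```

Both patterns accept "abc DN-NP def" and reject "abc DN-NPQRS def" and "abc ZYXDN-NP def", because there is no word boundary between two adjacent letters.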

Please build a list of things you want to match / not match. We might have a better answer for you.

Why does MATCH AGAINST return different results than LIKE?

The + operator is used in searches IN BOOLEAN MODE. I think it will be ignored in NATURAL LANGUAGE MODE (default).

Try:

SELECT * FROM object_search
WHERE MATCH (keywords)
AGAINST ('+woman +man' IN BOOLEAN MODE); -- could return rows containing both "man" and "woman" (ignoring ft_min_word_len, see below)

Besides, fulltext indexes will cover words only. Punctuation signs (such as ,) will always be ignored. You cannot "fulltext-search" non-alphanumeric characters.

Finally, by default, words shorter than 4 characters are ignored on MyISAM (InnoDB's default minimum is 3). Therefore, by default, "man" is not indexed on MyISAM. This limit can be changed through the ft_min_word_len configuration option (innodb_ft_min_token_size for InnoDB).

Also, mind the stopwords (common words that are never indexed).
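To see which stopwords apply and to pick up a changed word-length setting, something along these lines can be used. The table and column names are taken from the query above; the index name `ft_keywords` is hypothetical.

```sql
-- Default stopword list used by InnoDB full-text indexes.
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

-- MyISAM: after changing ft_min_word_len in the server configuration and
-- restarting the server, rebuild the existing FULLTEXT index.
REPAIR TABLE object_search QUICK;

-- InnoDB: after changing innodb_ft_min_token_size and restarting,
-- drop and re-create the index instead.
ALTER TABLE object_search DROP INDEX ft_keywords;
ALTER TABLE object_search ADD FULLTEXT INDEX ft_keywords (keywords);
```

Only one of the two rebuild variants applies to a given table, depending on its storage engine.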

SQL 'like' vs '=' performance

See https://web.archive.org/web/20150209022016/http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx

Quote from there:

the rules for index usage with LIKE are loosely like this:

  • If your filter criteria uses equals = and the field is indexed, then most likely it will use an INDEX/CLUSTERED INDEX SEEK

  • If your filter criteria uses LIKE, with no wildcards (like if you had a parameter in a web report that COULD have a % but you instead use the full string), it is about as likely as #1 to use the index. The increased cost is almost nothing.

  • If your filter criteria uses LIKE, but with a wildcard at the beginning (as in Name0 LIKE '%UTER') it's much less likely to use the index, but it still may at least perform an INDEX SCAN on a full or partial range of the index.

  • HOWEVER, if your filter criteria uses LIKE, but starts with a STRING FIRST and has wildcards somewhere AFTER that (as in Name0 LIKE 'COMP%ER'), then SQL may just use an INDEX SEEK to quickly find rows that have the same first starting characters, and then look through those rows for an exact match.

(Also keep in mind, the SQL engine still might not use an index the way you're expecting, depending on what else is going on in your query and what tables you're joining to. The SQL engine reserves the right to rewrite your query a little to get the data in a way that it thinks is most efficient and that may include an INDEX SCAN instead of an INDEX SEEK)
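The quote uses SQL Server terminology (INDEX SEEK, INDEX SCAN). As a loose MySQL counterpart, the four cases can be compared with `EXPLAIN`, where the `type` column plays a similar role: `ref` and `range` are index lookups, while `index` and `ALL` are full scans. The `computers` table is hypothetical; `Name0` is the column name from the quote.

```sql
-- Hypothetical table with an ordinary index on the filtered column.
CREATE TABLE computers (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    Name0 VARCHAR(64) NOT NULL,
    INDEX idx_name0 (Name0)
) ENGINE=InnoDB;

EXPLAIN SELECT id FROM computers WHERE Name0 = 'COMPUTER';     -- equality
EXPLAIN SELECT id FROM computers WHERE Name0 LIKE 'COMPUTER';  -- LIKE without wildcard
EXPLAIN SELECT id FROM computers WHERE Name0 LIKE '%UTER';     -- wildcard at the beginning
EXPLAIN SELECT id FROM computers WHERE Name0 LIKE 'COMP%ER';   -- literal prefix, wildcard later
```

As the quote warns, the optimizer may still choose a different plan, especially on very small tables.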

Mysql select by best match with like

You can readily order the results by the number of matches:

SELECT `id`
FROM `accounts`
WHERE AES_DECRYPT(`email`, '123') = CONCAT_WS('@', 'test', 'test.com')
   OR AES_DECRYPT(`email`, '123') LIKE CONCAT('%', 'test.com')
ORDER BY ( (AES_DECRYPT(`email`, '123') = CONCAT_WS('@', 'test', 'test.com')) +
           (AES_DECRYPT(`email`, '123') LIKE CONCAT('%', 'test.com'))
         ) DESC;  -- rows that satisfy both conditions come first

This will work for your example.
