Performance of Like '%Query%' VS Full Text Search Contains Query

Full-text search (using CONTAINS) will be faster and more efficient than LIKE with wildcards. Full-text search (FTS) lets you define full-text indexes, which its queries can then use. I don't know why you wouldn't define a full-text index if you intended to use the functionality.

LIKE with a wildcard on the left side (e.g. LIKE '%Search') cannot seek on an index (assuming one exists for the column), so the whole table or index has to be scanned. I haven't tested and compared, but regex has the same pitfall. To clarify: LIKE '%Search' and LIKE '%Search%' cannot use an index seek; LIKE 'Search%' can.
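The reason a leading wildcard defeats an ordinary index can be sketched outside any database. An index keeps its keys sorted, so a known prefix maps to one contiguous range of keys, while a known suffix or substring can sit anywhere. The Python sketch below uses a sorted list as a stand-in for the index; the names `prefix_seek` and `substring_scan` are made up for illustration and are not part of any database API.

```python
import bisect

# A sorted list stands in for the keys of an ordinary (B-tree style) index.
index_keys = sorted(
    ["Search", "Searchable", "Season", "Research", "Unsearched", "Zebra"]
)

def prefix_seek(keys, prefix):
    """Emulate LIKE 'prefix%': binary-search to the contiguous range of keys."""
    lo = bisect.bisect_left(keys, prefix)
    # Upper bound sentinel; assumes no key contains the character U+FFFF.
    hi = bisect.bisect_left(keys, prefix + "\uffff")
    return keys[lo:hi]

def substring_scan(keys, fragment):
    """Emulate LIKE '%fragment%': every key has to be examined."""
    return [k for k in keys if fragment in k]

prefix_hits = prefix_seek(index_keys, "Sea")
substring_hits = substring_scan(index_keys, "earch")

print(prefix_hits)     # ['Search', 'Searchable', 'Season']
print(substring_hits)  # ['Research', 'Search', 'Searchable', 'Unsearched']
```

The prefix lookup touches only the matching range, whereas the substring lookup visits all six keys, which is roughly the difference between a seek and a scan.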

SQL full text search vs LIKE

Full-text search is likely to be quicker, since it benefits from an index of words that it uses to look up the records, whereas LIKE will need a full table scan.

In some cases LIKE will be more accurate: LIKE "%The%" AND LIKE "%Matrix" will pick out "The Matrix" but not "Matrix Reloaded", whereas full-text search will ignore "The" and return both. That said, returning both would likely have been the better result.
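The "The Matrix" example can be reproduced with a small Python sketch. It is only an emulation under stated assumptions: `like` handles the % wildcard but not _, matching is case-insensitive, and the stopword list is a made-up stand-in for whatever stoplist a real full-text engine uses.

```python
import re

titles = ["The Matrix", "Matrix Reloaded"]

def like(value, pattern):
    """Emulate a case-insensitive SQL LIKE where % matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None

# Assumed stoplist; real engines ship their own, usually much longer.
STOPWORDS = {"the", "a", "an", "of"}

def word_search(value, query):
    """Emulate a word-based search that drops stopwords from the query."""
    doc_words = set(re.findall(r"\w+", value.lower()))
    query_words = set(re.findall(r"\w+", query.lower())) - STOPWORDS
    return query_words <= doc_words

like_hits = [t for t in titles if like(t, "%The%") and like(t, "%Matrix")]
fts_hits = [t for t in titles if word_search(t, "The Matrix")]

print(like_hits)  # ['The Matrix']
print(fts_hits)   # ['The Matrix', 'Matrix Reloaded']
```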

Performance with LIKE vs CONTAINS using full-text indexing

Full-Text indexing is about searching for language words in unstructured text data. Your data doesn't contain words, just a sequence of characters.

I haven't tested this, but I would expect that LIKE would actually be faster, as long as your data is indexed. CONTAINS is meant for searching for words & word-like structures.

If your requirement is for "auto-complete", then LIKE will perform pretty well since the optimizer will use an INDEX SEEK when you search for something such as LIKE 'F5521%'.

This MSDN article explains the basics of the CONTAINS keyword.

SQL 'like' vs '=' performance

See https://web.archive.org/web/20150209022016/http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx

Quote from there:

the rules for index usage with LIKE are loosely like this:

  • If your filter criteria uses equals = and the field is indexed, then most likely it will use an INDEX/CLUSTERED INDEX SEEK

  • If your filter criteria uses LIKE, with no wildcards (like if you had a parameter in a web report that COULD have a % but you instead use the full string), it is about as likely as #1 to use the index. The increased cost is almost nothing.

  • If your filter criteria uses LIKE, but with a wildcard at the beginning (as in Name0 LIKE '%UTER') it's much less likely to use the index, but it still may at least perform an INDEX SCAN on a full or partial range of the index.

  • HOWEVER, if your filter criteria uses LIKE, but starts with a STRING FIRST and has wildcards somewhere AFTER that (as in Name0 LIKE 'COMP%ER'), then SQL may just use an INDEX SEEK to quickly find rows that have the same first starting characters, and then look through those rows for an exact match.

(Also keep in mind, the SQL engine still might not use an index the way you're expecting, depending on what else is going on in your query and what tables you're joining to. The SQL engine reserves the right to rewrite your query a little to get the data in a way that it thinks is most efficient and that may include an INDEX SCAN instead of an INDEX SEEK)
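The quoted rules are about SQL Server, but the same seek-versus-scan pattern can be observed in SQLite, which ships with Python. This is a sketch under assumptions, not a SQL Server plan: the table and index names are invented, the column is declared COLLATE NOCASE because SQLite only applies its LIKE optimization to a case-insensitive index by default, and SQLite reports SEARCH and SCAN where SQL Server would say seek and scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOCASE collation lets SQLite's case-insensitive LIKE use the index.
conn.execute("CREATE TABLE computers (name0 TEXT COLLATE NOCASE)")
conn.execute("CREATE INDEX idx_name0 ON computers (name0)")
conn.executemany(
    "INSERT INTO computers VALUES (?)",
    [("COMPUTER",), ("COMPILER",), ("ROUTER",), ("PRINTER",)],
)

def plan(where_clause):
    """Return the plan description SQLite reports for the given filter."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT name0 FROM computers WHERE " + where_clause
    ).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable detail

equals_plan = plan("name0 = 'COMPUTER'")     # equality on an indexed column
prefix_plan = plan("name0 LIKE 'COMP%ER'")   # string first, wildcard after
suffix_plan = plan("name0 LIKE '%UTER'")     # wildcard at the beginning

for label, detail in [("=", equals_plan), ("COMP%ER", prefix_plan), ("%UTER", suffix_plan)]:
    print(label, "->", detail)
```

The exact wording of the plan text varies between SQLite versions, but the first two should begin with SEARCH and the last with SCAN.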

What is Full Text Search vs LIKE

In general, there is a tradeoff between "precision" and "recall". High precision means that fewer irrelevant results are presented (no false positives), while high recall means that fewer relevant results are missing (no false negatives). Using the LIKE operator gives you 100% precision with no concessions for recall. A full text search facility gives you a lot of flexibility to tune down the precision for better recall.
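To make the two terms concrete: precision is the share of returned records that are relevant, and recall is the share of relevant records that were returned. The sketch below computes both for two hypothetical result sets; the record ids and which of them count as relevant are invented purely for illustration.

```python
def precision_recall(returned, relevant):
    """Compute (precision, recall) for one query from sets of record ids."""
    returned, relevant = set(returned), set(relevant)
    true_positives = len(returned & relevant)
    precision = true_positives / len(returned) if returned else 1.0
    recall = true_positives / len(relevant) if relevant else 1.0
    return precision, recall

# Hypothetical data: records 1-4 are what the user actually wanted.
relevant_ids = {1, 2, 3, 4}
like_result = {1, 2}           # exact substring matches only
fts_result = {1, 2, 3, 4, 9}   # looser matching, plus one false positive

like_scores = precision_recall(like_result, relevant_ids)
fts_scores = precision_recall(fts_result, relevant_ids)

print(like_scores)  # (1.0, 0.5)
print(fts_scores)   # (0.8, 1.0)
```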

Most full text search implementations use an "inverted index". This is an index where the keys are individual terms, and the associated values are sets of records that contain the term. Full text search is optimized to compute the intersection, union, etc. of these record sets, and usually provides a ranking algorithm to quantify how strongly a given record matches search keywords.
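A toy inverted index fits in a few lines of Python. This is a minimal sketch, assuming whitespace-and-punctuation tokenization and no stemming or ranking; the sample documents and function names are made up for the example.

```python
import re
from collections import defaultdict

documents = {
    1: "full text search uses an inverted index",
    2: "LIKE with a leading wildcard scans every row",
    3: "an inverted index maps each term to matching rows",
}

def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return index

def search_all(index, terms):
    """AND query: intersect the record sets of every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def search_any(index, terms):
    """OR query: union the record sets of every term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))

inverted = build_inverted_index(documents)
and_hits = search_all(inverted, ["inverted", "index"])
or_hits = search_any(inverted, ["wildcard", "rows"])

print(and_hits)  # {1, 3}
print(or_hits)   # {2, 3}
```

Each lookup is a dictionary access followed by set arithmetic, so no document text is rescanned at query time.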

The SQL LIKE operator can be extremely inefficient. If you apply it to an un-indexed column, a full scan will be used to find matches (just like any query on an un-indexed field). If the column is indexed, matching can be performed against index keys, but with far less efficiency than most index lookups. In the worst case, the LIKE pattern will have leading wildcards that require every index key to be examined. In contrast, many information retrieval systems can enable support for leading wildcards by pre-compiling suffix trees in selected fields.

Other features typical of full-text search are

  • lexical analysis or tokenization: breaking a block of unstructured text into individual words, phrases, and special tokens
  • morphological analysis, or stemming: collapsing variations of a given word into one index term; for example, treating "mice" and "mouse", or "electrification" and "electric", as the same word
  • ranking: measuring the similarity of a matching record to the query string
PostgreSQL ILIKE versus TSEARCH

A full-text search setup is not identical to a "contains"-style LIKE query. It stems words and so on, so you can match "cars" against "car".

If you really want a fast ILIKE then no standard database index or FTS will help. Fortunately, the pg_trgm module can do that.

  • http://www.postgresql.org/docs/9.1/static/pgtrgm.html
  • http://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/
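The idea behind a trigram index can be sketched in Python. This is an illustration of the technique, not pg_trgm's actual implementation: the padding (two spaces before a word, one after) follows what pg_trgm documents, but words are split on whitespace only here, and the function names and sample rows are invented.

```python
def trigrams(text):
    """Split text into 3-character windows, padding each word like pg_trgm does."""
    grams = set()
    for word in text.lower().split():
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def pattern_trigrams(fragment):
    """Trigrams of a '%fragment%' pattern; no padding, since wildcards surround it."""
    f = fragment.lower()
    return {f[i:i + 3] for i in range(len(f) - 2)}

rows = ["Full Text Search", "Research notes", "Seasonal report"]
row_grams = {row: trigrams(row) for row in rows}  # stands in for the index

def ilike_contains(fragment):
    """Emulate ILIKE '%fragment%': filter by trigrams, then recheck the text."""
    needed = pattern_trigrams(fragment)
    candidates = [r for r in rows if needed <= row_grams[r]]
    return [r for r in candidates if fragment.lower() in r.lower()]

hits = ilike_contains("earch")

print(sorted(trigrams("cat")))  # ['  c', ' ca', 'at ', 'cat']
print(hits)                     # ['Full Text Search', 'Research notes']
```

Because the trigrams of the fragment are the same wherever it appears in a word, the lookup does not care about the leading wildcard. A fragment shorter than three characters yields no trigrams in this sketch, so every row becomes a candidate and the recheck does all the work.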

SQL Server Index - Any improvement for LIKE queries?

Only if you add full-text searching to those columns, and use the full-text query capabilities of SQL Server.

Otherwise, no: an ordinary index will not help a LIKE pattern that starts with a wildcard.

Text search in MySQL - Performance and Alternatives

MySQL isn't a search engine, it's a Relational Database Management System (RDBMS). However, you can use native MySQL tools to emulate full-text searching capabilities, such as setting up a search table as MyISAM (or InnoDB, which has supported FULLTEXT indexes since MySQL 5.6) and adding a FULLTEXT index to the columns you wish to search. You can read the MySQL docs for more info on how MySQL supports full-text searching.

Even if you get full-text search queries to work the way you want, you will still miss out on a whole host of features that a true search engine (such as Lucene) supports: facets, spatial searches, result boosting, weighting, etc. I'd suggest you read up on Apache Solr, as it supports all these features and many more. There is even a PHP Solr API which you can use to access a Solr instance.

I'm not saying to abandon MySQL altogether, but use it for its intended purpose: to persistently store data which can be queried, and which can be used to populate your search engine indices. Solr even has a built-in Data Import Handler, which lets you specify a database query to run when you want to mass import data from your MySQL database.

The learning curve is relatively high, as it is with learning most new technologies, but when you are done you will wonder how you ever got by without using a true Full-Text search engine.


