Fastest Way to Find String by Substring in SQL

Fastest way to find string by substring in SQL?

If you want to use less space than Randy's answer and there is considerable repetition in your data, you can build an N-ary tree (a trie) in which each edge is the next character, and hang each string and each of its trailing substrings (suffixes) on it.

Number the nodes in depth-first order. Then create a table with up to 255 rows for each of your records (one per trailing substring), holding the Id of the record and the id of the tree node that matches the string or trailing substring. To search, find the node that represents the string you are searching for. Because the numbering is depth first, that node and everything below it occupy one contiguous range of node ids, so a range search over that range returns every record that contains the search string.
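The scheme above can be sketched in Python as follows. This is only an illustration under assumptions: the names `Node`, `build_index` and `search` are hypothetical, the sorted list `rows` stands in for the indexed (node id, record id) table, and the recursive numbering is only suitable for short strings.

```python
from bisect import bisect_left


class Node:
    def __init__(self):
        self.children = {}  # next character -> child node
        self.first = 0      # depth-first id of this node
        self.last = 0       # largest depth-first id inside this node's subtree


def build_index(records):
    """records maps record id -> string. Returns (root, rows)."""
    root = Node()
    ends = []  # (record id, node where one trailing substring ends)
    for rec_id, text in records.items():
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node.children.setdefault(ch, Node())
            ends.append((rec_id, node))

    counter = 0

    def number(node):
        # Pre-order numbering: a subtree always gets a contiguous id range.
        nonlocal counter
        node.first = counter
        counter += 1
        for ch in sorted(node.children):
            number(node.children[ch])
        node.last = counter - 1

    number(root)
    # Stand-in for the table: (node id, record id), sorted by node id.
    rows = sorted((node.first, rec_id) for rec_id, node in ends)
    return root, rows


def search(root, rows, needle):
    """Return the ids of all records that contain needle as a substring."""
    node = root
    for ch in needle:
        node = node.children.get(ch)
        if node is None:
            return set()
    # Range search: every row whose node id lies in [node.first, node.last].
    lo = bisect_left(rows, (node.first,))
    hi = bisect_left(rows, (node.last + 1,))
    return {rec_id for _, rec_id in rows[lo:hi]}


records = {1: "banana", 2: "bandana", 3: "cherry"}
root, rows = build_index(records)
print(search(root, rows, "ana"))
```

A record can fall inside the range more than once (several of its trailing substrings may start with the search string), which is why the result is collected into a set, much like a SELECT DISTINCT would do.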

Find substring in string

You can use LIKE:

SELECT * FROM YourTable t
WHERE 'random words ....' LIKE '%' + t.column + '%'

SELECT * FROM YourTable t
WHERE t.column LIKE '%random words ....%'

Which one you need depends on what you mean. The first query selects the records whose column value is contained in the provided string. The second does the opposite: it selects the records whose column contains the provided string.
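To make the difference between the two directions concrete, here is a small sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not the original setup: SQLite concatenates with `||` rather than `+`, and the column name `col` and the sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (col TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)",
    [("random",), ("some random words here",), ("unrelated",)],
)

phrase = "random words"

# Direction 1: the column value is a part of the provided string.
contained_in_phrase = [row[0] for row in conn.execute(
    "SELECT col FROM YourTable WHERE ? LIKE '%' || col || '%' ORDER BY col",
    (phrase,),
)]

# Direction 2: the column value contains the provided string.
contains_phrase = [row[0] for row in conn.execute(
    "SELECT col FROM YourTable WHERE col LIKE '%' || ? || '%' ORDER BY col",
    (phrase,),
)]

print(contained_in_phrase)
print(contains_phrase)
```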

SQL SELECT WHERE field contains words

A rather slow but working method that matches rows containing any of the words:

SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
   OR column1 LIKE '%word2%'
   OR column1 LIKE '%word3%'

If you need all words to be present, use this:

SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
  AND column1 LIKE '%word2%'
  AND column1 LIKE '%word3%'

If you want something faster, you need to look into full-text search, which works differently in each database product.
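When the word list is not fixed, the LIKE conditions above can be generated. The following sketch uses Python's sqlite3 module with bound parameters; the helper name `find_rows` and the sample rows are hypothetical, and the table and column names are taken from the queries above.

```python
import sqlite3


def find_rows(conn, words, match_all=False):
    """Return column1 values containing any (or, with match_all, all) of words."""
    if not words:
        return []
    joiner = " AND " if match_all else " OR "
    # One "column1 LIKE ?" condition per word; the words are bound as parameters.
    where = joiner.join("column1 LIKE ?" for _ in words)
    params = [f"%{word}%" for word in words]
    sql = f"SELECT column1 FROM mytable WHERE {where} ORDER BY column1"
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (column1 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("red apple",), ("green apple",), ("red car",)],
)

print(find_rows(conn, ["red", "apple"]))
print(find_rows(conn, ["red", "apple"], match_all=True))
```

Only the number of placeholders is interpolated into the SQL text. The words themselves are passed as parameters, so quotes in the input cannot break the statement.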

What is the best way to find string containing a sequence of identical digits in SQL? (fake phones search)

Update: To find recurring numbers, you can use a function like this. It returns 1 if any of the characters in @text are used at least @min times consecutively.

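-- Returns 1 if any character in @text occurs at least @min times in a row, else 0.
create function IsRecurring(@text varchar(255), @min int) returns int as
begin
    -- Start at the last position where a window of @min characters still fits.
    declare @i int = len(@text) - @min + 1
    declare @result int = 0
    while @i > 0 begin
        -- Removing the window's first character from the window leaves ''
        -- only if every character in the window is the same.
        if replace(substring(@text, @i, @min), substring(@text, @i, 1), '') = '' begin
            select @result = 1
            break
        end
        select @i = @i - 1
    end
    return @result
end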

Example usage:

select dbo.IsRecurring('84455552', 4)

This returns 1, because the digit 5 occurs four times in a row.
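For comparison, the same check can be sketched outside the database. This Python version is an illustration only: the name `is_recurring` is hypothetical, and it assumes `min_run` is at least 1.

```python
from itertools import groupby


def is_recurring(text, min_run):
    """Return 1 if any character occurs at least min_run times in a row, else 0."""
    # groupby splits the text into runs of identical consecutive characters.
    return int(any(sum(1 for _ in run) >= min_run for _, run in groupby(text)))


print(is_recurring("84455552", 4))
```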

How to check if a column contains a substring of string in SQL?

You just flip the two terms in your LIKE operator:

SELECT * 
FROM mytable
WHERE 'words' LIKE CONCAT('%',name,'%')

I believe LOCATE() and INSTR() work here too, and they look nicer because the search term does not need to be concatenated with wildcards:

SELECT *
FROM mytable
WHERE INSTR('words', name) > 0
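One practical difference between the two forms is worth a sketch: in the LIKE version, any `%` or `_` stored in the column is treated as a wildcard, while INSTR compares literally. The example below runs on SQLite through Python's sqlite3 module (so `||` replaces CONCAT, and `instr()` is SQLite's function); the sample names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("word",), ("w_rd",), ("sword",)],
)

# LIKE form: the stored value 'w_rd' acts as a pattern, and '_' matches 'o'.
like_matches = [row[0] for row in conn.execute(
    "SELECT name FROM mytable WHERE ? LIKE '%' || name || '%' ORDER BY name",
    ("words",),
)]

# INSTR form: the stored value is compared literally.
instr_matches = [row[0] for row in conn.execute(
    "SELECT name FROM mytable WHERE instr(?, name) > 0 ORDER BY name",
    ("words",),
)]

print(like_matches)
print(instr_matches)
```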

Check if a string contains a substring in SQL Server 2005, using a stored procedure

CHARINDEX() searches for a substring within a larger string and returns the position of the match, or 0 if no match is found.

if CHARINDEX('ME',@mainString) > 0
begin
    --do something
end

Edit: or, from daniels answer, if you want to find a whole word (and not a fragment inside a longer word), your CHARINDEX call would look like:

CHARINDEX(' ME ',' ' + REPLACE(REPLACE(@mainString,',',' '),'.',' ') + ' ')

(Add more nested REPLACE() calls for any other punctuation that may occur.)
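The padding trick is easier to see in isolation. The following Python sketch mirrors the expression above: it turns punctuation into spaces, pads both ends with a space, and looks for the word surrounded by spaces. The name `contains_word` and the `punctuation` parameter are hypothetical, and unlike CHARINDEX, whose case sensitivity depends on the collation, this comparison is always case sensitive.

```python
def contains_word(main_string, word, punctuation=",."):
    """Return True if word occurs in main_string as a whole word."""
    padded = " " + main_string + " "
    # Equivalent of the nested REPLACE() calls: one replacement per mark.
    for mark in punctuation:
        padded = padded.replace(mark, " ")
    return (" " + word + " ") in padded


print(contains_word("Tell ME, please", "ME"))
print(contains_word("HOME sweet home", "ME"))
```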

Fastest way to find records that end with key

Unfortunately, searching for strings ending with a particular pattern is difficult on most databases⁺, because searching for string suffixes cannot use an index. This results in full table scans, which may be slow on tables with millions of rows.

If your database supports reverse indexes, add one for your string key column; otherwise, you can improve performance by simulating reverse indexes:

1. Add a column for storing your string key in reverse:
   - If your RDBMS supports computed columns, add one for the reversed key.
   - Otherwise, define a trigger that populates the reversed column from the key column.
2. Create an index on the reversed column.
3. Use the reversed column for your searches by passing in the reversed suffix that you are looking for.

For example, if you have data like this

key
-----------
01-02-3-xyz
07-12-8-abc

then the augmented table would have

key           rev_key
-----------   -----------
01-02-3-xyz   zyx-3-20-10
07-12-8-abc   cba-8-21-70

and your search for ENDS_WITH(key, '3-xyz') would ask for STARTS_WITH(rev_key, 'zyx-3'). Since string indexes speed up lookups by prefix, the "starts with" lookup would go much faster.
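A minimal sketch of the reversed-column approach, using SQLite through Python's sqlite3 module. Several things here are assumptions for the illustration: the table `items`, the column `item_key` and the helpers `add_item` and `ends_with` are hypothetical, the reversed value is filled in by application code rather than by a computed column or trigger, and the "starts with" test is written as a half-open range, which assumes plain ASCII keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_key TEXT, rev_key TEXT)")
conn.execute("CREATE INDEX idx_items_rev_key ON items (rev_key)")


def add_item(conn, key):
    # Store the key together with its reversed form.
    conn.execute(
        "INSERT INTO items (item_key, rev_key) VALUES (?, ?)",
        (key, key[::-1]),
    )


def ends_with(conn, suffix):
    """Return the keys that end with suffix, searching the reversed column."""
    if not suffix:
        return [row[0] for row in conn.execute(
            "SELECT item_key FROM items ORDER BY item_key")]
    prefix = suffix[::-1]
    # Every string that starts with prefix sorts inside [prefix, upper).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    rows = conn.execute(
        "SELECT item_key FROM items "
        "WHERE rev_key >= ? AND rev_key < ? ORDER BY item_key",
        (prefix, upper),
    )
    return [row[0] for row in rows]


add_item(conn, "01-02-3-xyz")
add_item(conn, "07-12-8-abc")
print(ends_with(conn, "3-xyz"))
```

The range form is used here because a plain range comparison on an indexed column is something most engines can answer from the index. A `LIKE 'zyx-3%'` predicate expresses the same lookup where the database optimizes prefix patterns.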

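⁺ Oracle provides reverse key indexes, but check its documentation before relying on them here: they store the key bytes reversed mainly to spread sequential inserts across index blocks, and they cannot be used for index range scans, so a reversed column may still be needed for suffix searches.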

What is the fastest way to find the occurrence of a string in another string?

strpos seems to be in the lead. I tested both functions by finding some strings in 'The quick brown fox jumps over the lazy dog':

strstr used 0.48487210273743 seconds for 1000000 iterations finding 'quick'
strpos used 0.40836095809937 seconds for 1000000 iterations finding 'quick'
strstr used 0.45261287689209 seconds for 1000000 iterations finding 'dog'
strpos used 0.39890813827515 seconds for 1000000 iterations finding 'dog'

<?php

    $haystack = 'The quick brown fox jumps over the lazy dog';

    $needle = 'quick';

    $iter = 1000000;

    // microtime(true) returns the current time in seconds as a float,
    // so each $duration below is in seconds.
    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strstr($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strstr used $duration seconds for $iter iterations finding '$needle' in '$haystack'";

    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strpos($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strpos used $duration seconds for $iter iterations finding '$needle' in '$haystack'";

    $needle = 'dog';

    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strstr($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strstr used $duration seconds for $iter iterations finding '$needle' in '$haystack'";

    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strpos($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strpos used $duration seconds for $iter iterations finding '$needle' in '$haystack'";

?>
