Fastest way to find string by substring in SQL?
If you want to use less space than Randy's answer and there is considerable repetition in your data, you can build an N-ary tree (a trie) in which each edge is the next character, and hang each string and each of its trailing substrings on it.
Number the nodes in depth-first order. Then create a table that holds, for each record, the Id of the record and the id of the tree node that matches the string or one of its trailing substrings (one row per trailing substring, so up to 255 rows per record). To search, find the node that represents the string you are searching for. Because the nodes are numbered depth first, every string or trailing substring that starts with the search string sits at a node whose id falls in one contiguous range starting at that node, so a range search on the node id column returns all matching records.
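The scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `Node`, `build_index` and `search` are made up here, the "table" is a sorted list of (node id, record id) pairs standing in for an indexed SQL table, and the recursive numbering assumes short strings.

```python
from bisect import bisect_left


class Node:
    def __init__(self):
        self.children = {}  # next character -> child node
        self.first = 0      # this node's depth-first number
        self.last = 0       # largest number inside this node's subtree


def number_nodes(node, counter=0):
    """Assign depth-first numbers and return the last number used."""
    node.first = counter
    for ch in sorted(node.children):
        counter = number_nodes(node.children[ch], counter + 1)
    node.last = counter
    return counter


def build_index(records):
    """records maps record id -> string. Returns (root, rows)."""
    root = Node()
    ends = []  # (record id, node where one trailing substring ends)
    for rec_id, text in records.items():
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node.children.setdefault(ch, Node())
            ends.append((rec_id, node))
    number_nodes(root)
    # Stand-in for the SQL table, kept sorted like an index on node id.
    rows = sorted({(node.first, rec_id) for rec_id, node in ends})
    return root, rows


def search(root, rows, needle):
    """Return ids of the records that contain needle as a substring."""
    node = root
    for ch in needle:
        node = node.children.get(ch)
        if node is None:
            return set()
    # Range search: every node id in [first, last] lies below this node.
    lo = bisect_left(rows, (node.first,))
    hi = bisect_left(rows, (node.last + 1,))
    return {rec_id for _, rec_id in rows[lo:hi]}


records = {1: "banana", 2: "bandana", 3: "cabana"}
root, rows = build_index(records)
print(sorted(search(root, rows, "ana")))   # [1, 2, 3]
print(sorted(search(root, rows, "band")))  # [2]
```

In a real database the two `bisect_left` calls would correspond to a `BETWEEN` condition on the indexed node id column.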
Find substring in string
You can use LIKE:
SELECT * FROM YourTable t
WHERE 'random words ....' LIKE '%' + t.column + '%'
Or
SELECT * FROM YourTable t
WHERE t.column LIKE '%random words ....%'
Which one you need depends on what you mean. The first selects the records whose column value is part of the provided string. The second does the opposite: it selects the records whose column value contains the provided string.
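To see the two directions side by side, here is a small sketch that runs both queries against an in-memory SQLite database from Python. The sample rows and the column name `col` are invented for the demo, and SQLite concatenates with `||` where the queries above use `+`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (col TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)",
    [("random",), ("words",), ("some random words here",), ("other",)],
)

phrase = "random words"

# Direction 1: the column value is part of the provided string.
contained = [row[0] for row in conn.execute(
    "SELECT col FROM YourTable t WHERE ? LIKE '%' || t.col || '%' ORDER BY col",
    (phrase,),
)]

# Direction 2: the column value contains the provided string.
containing = [row[0] for row in conn.execute(
    "SELECT col FROM YourTable t WHERE t.col LIKE '%' || ? || '%' ORDER BY col",
    (phrase,),
)]

print(contained)   # ['random', 'words']
print(containing)  # ['some random words here']
```

Note that any `%` or `_` inside the stored values or the phrase acts as a wildcard in both queries.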
SQL SELECT WHERE field contains words
A rather slow but working method to match any of the words:
SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
OR column1 LIKE '%word2%'
OR column1 LIKE '%word3%'
If you need all words to be present, use this:
SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
AND column1 LIKE '%word2%'
AND column1 LIKE '%word3%'
If you want something faster, you need to look into full-text search, which is very specific to each database type.
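When the word list is not fixed, the same OR/AND conditions can be generated. The following is a rough sketch, run against SQLite from Python; the helper `like_clause` is hypothetical, and the column name is inserted into the SQL text directly, so it must come from trusted code, never from user input.

```python
import sqlite3


def like_clause(column, words, require_all):
    """Build 'col LIKE ? OR/AND col LIKE ? ...' plus its parameters."""
    if not words:
        raise ValueError("at least one word is required")
    joiner = " AND " if require_all else " OR "
    sql = joiner.join(f"{column} LIKE ?" for _ in words)
    params = [f"%{word}%" for word in words]
    return sql, params


def find(conn, words, require_all):
    where, params = like_clause("column1", words, require_all)
    query = f"SELECT column1 FROM mytable WHERE {where} ORDER BY column1"
    return [row[0] for row in conn.execute(query, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (column1 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("red green blue",), ("red only",), ("nothing",)],
)

print(find(conn, ["red", "blue"], require_all=False))  # ['red green blue', 'red only']
print(find(conn, ["red", "blue"], require_all=True))   # ['red green blue']
```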
What is the best way to find string containing a sequence of identical digits in SQL? (fake phones search)
Update: To find recurring numbers, you can use a function like this. It returns 1 if any of the characters in @text are used at least @min times consecutively.
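create function IsRecurring(@text varchar(255), @min int) returns int as
begin
    -- start at the last position where a run of @min characters still fits
    declare @i int = len(@text) - @min + 1
    declare @result int = 0
    while @i > 0 begin
        -- removing the window's first character from the window leaves
        -- nothing only when all @min characters in it are identical
        if replace(substring(@text, @i, @min), substring(@text, @i, 1), '') = '' begin
            select @result = 1
            break
        end
        select @i = @i - 1
    end
    return @result
end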
Example usage:
select dbo.IsRecurring('84455552', 4)
This returns 1, because the digit 5 is used 4 times consecutively.
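For readers who want to try the logic outside SQL Server, here is a rough Python port of the same sliding-window check. The name `is_recurring` and the guard against a non-positive `min_run` are additions for this sketch, not part of the function above.

```python
def is_recurring(text, min_run):
    """Return 1 if some character occurs min_run times in a row, else 0."""
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    # Walk the window start from the last possible position down to 0.
    for i in range(len(text) - min_run, -1, -1):
        window = text[i:i + min_run]
        # Removing the first character everywhere leaves an empty string
        # only when the whole window is that one character repeated.
        if window.replace(window[0], "") == "":
            return 1
    return 0


print(is_recurring("84455552", 4))  # 1
print(is_recurring("84455552", 5))  # 0
```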
How to check if a column contains a substring of string in SQL?
You just flip the two terms in your LIKE operator:
SELECT *
FROM mytable
WHERE 'words' LIKE CONCAT('%',name,'%')
I believe that LOCATE() and INSTR() may work here too, and they look nicer since there is no need to concatenate anything to the search term/substring.
SELECT *
FROM mytable
WHERE INSTR('words', name) > 0
Check if a string contains a substring in SQL Server 2005, using a stored procedure
CHARINDEX() searches for a substring within a larger string and returns the position of the match, or 0 if no match is found:
if CHARINDEX('ME',@mainString) > 0
begin
--do something
end
Edit: or, from daniels answer, if you want to find a whole word (and not a fragment of a longer word), your CHARINDEX call would look like:
CHARINDEX(' ME ',' ' + REPLACE(REPLACE(@mainString,',',' '),'.',' ') + ' ')
(Add more nested REPLACE() calls for any other punctuation that may occur.)
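The padding trick is easy to check outside the database. This Python sketch mirrors it: punctuation is replaced with spaces, the string is wrapped in spaces, and the search looks for the word surrounded by spaces. The function name `contains_word` and its `punctuation` parameter are made up for the illustration, and the comparison here is case sensitive, whereas CHARINDEX follows the collation in use.

```python
def contains_word(main_string, word, punctuation=",."):
    """True if word appears in main_string as a whole word."""
    padded = main_string
    for mark in punctuation:
        # Same role as the nested REPLACE() calls.
        padded = padded.replace(mark, " ")
    # Leading and trailing spaces let the first and last words match too.
    padded = " " + padded + " "
    return (" " + word + " ") in padded


print(contains_word("GIVE ME, PLEASE", "ME"))  # True
print(contains_word("AWESOME.", "ME"))         # False
```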
Fastest way to find records that end with key
Unfortunately, searching for strings ending with a particular pattern is difficult on most databases+, because searching for string suffixes cannot use an index. This results in full table scans, which may be slow on tables with millions of rows.
If your database supports reverse indexes, add one for your string key column; otherwise, you can improve performance by simulating reverse indexes:
- Add a column for storing your string key in reverse
  - If your RDBMS supports computed columns, add one for the reversed key
  - Otherwise, define a trigger that populates the reversed column from the key column
- Create an index on the reversed column
- Use the reversed column for your searches by passing in the reversed suffix that you are looking for.
For example, if you have data like this
key
-----------
01-02-3-xyz
07-12-8-abc
then the augmented table would have
key          rev_key
-----------  -----------
01-02-3-xyz  zyx-3-20-10
07-12-8-abc  cba-8-21-70
and your search for ENDS_WITH(key, '3-xyz') would ask for STARTS_WITH(rev_key, 'zyx-3'). Since string indexes speed up lookups by prefix, the "starts with" lookup would go much faster.
+ One notable exception is Oracle, which provides reverse key indexes specifically for situations like this.
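A minimal sketch of the simulated reverse index, using SQLite from Python. The table `items`, the column `item_key` and the helpers `add_item` and `ends_with` are names invented for the demo. The reversed column is filled in application code here instead of by a computed column or trigger, and the prefix lookup is written as a range comparison, which assumes a non-empty ASCII suffix. Whether a given engine actually uses the index for it should be checked with that engine's query plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_key TEXT, rev_key TEXT)")
conn.execute("CREATE INDEX idx_items_rev_key ON items (rev_key)")


def add_item(key):
    # Store the key together with its reversed form.
    conn.execute("INSERT INTO items VALUES (?, ?)", (key, key[::-1]))


def ends_with(suffix):
    """Find keys ending with suffix via a prefix range on rev_key."""
    if not suffix:
        raise ValueError("suffix must not be empty")
    prefix = suffix[::-1]
    # Smallest string greater than every string that starts with prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    rows = conn.execute(
        "SELECT item_key FROM items"
        " WHERE rev_key >= ? AND rev_key < ? ORDER BY item_key",
        (prefix, upper),
    )
    return [row[0] for row in rows]


for key in ("01-02-3-xyz", "07-12-8-abc", "09-11-3-xyz"):
    add_item(key)

print(ends_with("3-xyz"))  # ['01-02-3-xyz', '09-11-3-xyz']
print(ends_with("abc"))    # ['07-12-8-abc']
```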
What is the fastest way to find the occurrence of a string in another string?
strpos seems to be in the lead. I tested it by finding some strings in 'The quick brown fox jumps over the lazy dog':
strstr used 0.48487210273743 seconds for 1000000 iterations finding 'quick'
strpos used 0.40836095809937 seconds for 1000000 iterations finding 'quick'
strstr used 0.45261287689209 seconds for 1000000 iterations finding 'dog'
strpos used 0.39890813827515 seconds for 1000000 iterations finding 'dog'
<?php
$haystack = 'The quick brown fox jumps over the lazy dog';
$needle = 'quick';
$iter = 1000000;

// microtime(true) returns seconds as a float, so the durations are in seconds
$start = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    strstr($haystack, $needle);
}
$duration = microtime(true) - $start;
echo "<br/>strstr used $duration seconds for $iter iterations finding '$needle' in '$haystack'";

$start = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    strpos($haystack, $needle);
}
$duration = microtime(true) - $start;
echo "<br/>strpos used $duration seconds for $iter iterations finding '$needle' in '$haystack'";

// repeat both measurements with a needle near the end of the haystack
$needle = 'dog';
$start = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    strstr($haystack, $needle);
}
$duration = microtime(true) - $start;
echo "<br/>strstr used $duration seconds for $iter iterations finding '$needle' in '$haystack'";

$start = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    strpos($haystack, $needle);
}
$duration = microtime(true) - $start;
echo "<br/>strpos used $duration seconds for $iter iterations finding '$needle' in '$haystack'";
?>