How about the performance between PATINDEX and CHARINDEX?
CHARINDEX has a limit of 8000 characters for the string you are looking for, not for the string you are searching.
That article is probably wrong unless you want to look for strings longer than 8000/4000 characters. It also uses the text datatype, which is deprecated. It's dated 2007, so he would have used SQL Server 2005 or 2000, but the SQL Server 2005 BOL does not mention 8000 at all, and it's not something I've tried.
From the SQL Server 2008 BOL:
CHARINDEX ( expression1 ,expression2 [ , start_location ] )
expression1 Is a character expression that contains the sequence to be found. expression1 is limited to 8000 characters.
expression2 Is a character expression to be searched.
Note: PATINDEX does not mention an 8000 limit for SQL 2008 or 2005.
Finally, I'd use CHARINDEX because I think it's more intuitive for straightforward searches if you don't need pattern matching, and it supports long strings.
Performance optimized way to test whether a varchar field has any digit (SQL Server 2012)
You can query using a ranged pattern expression:
To Query for records with a digit:
where fieldName LIKE '%[0-9]%'
To Query for records without a digit:
where fieldName NOT LIKE '%[0-9]%'
See here: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/like-transact-sql?view=sql-server-2017#arguments for more information.
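A minimal sketch of the two predicates, run against a throwaway table variable (the @samples table and its rows are made up for illustration). One thing to keep in mind is that NOT LIKE does not return rows where the column is NULL, so those need an explicit IS NULL check if they should count as having no digit.

```sql
-- Hypothetical sample data, for illustration only
DECLARE @samples TABLE (fieldName varchar(50) NULL);
INSERT INTO @samples (fieldName) VALUES ('abc'), ('a1c'), ('42'), (''), (NULL);

-- Rows containing at least one digit: 'a1c' and '42'
SELECT fieldName FROM @samples WHERE fieldName LIKE '%[0-9]%';

-- Rows with no digit: 'abc' and ''; the NULL row is not returned
SELECT fieldName FROM @samples WHERE fieldName NOT LIKE '%[0-9]%';

-- Treat NULL as "no digit" as well
SELECT fieldName FROM @samples
WHERE fieldName NOT LIKE '%[0-9]%' OR fieldName IS NULL;
```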
T-SQL speed comparison between LEFT() vs. LIKE operator
Your best bet would be to measure the performance on real production data rather than trying to guess (or ask us). That's because performance can sometimes depend on the data you're processing, although in this case it seems unlikely (but I can't know that for sure, which is why you should check).
If this is a query you will be doing a lot, you should consider another (indexed) column which contains the lowercased first letter of name and have it set by an insert/update trigger.
This will, at the cost of a minimal storage increase, make this query blindingly fast:
select * from table where name_first_char_lower = @firstletter
That's because most databases are read far more often than written, and this will amortise the cost of the calculation (done only for writes) across all reads.
It introduces redundant data but it's okay to do that for performance as long as you understand (and mitigate, as in this suggestion) the consequences and need the extra performance.
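A rough sketch of what the extra column, its index and the trigger could look like. The table dbo.People, the id key, and the index and trigger names are all hypothetical; only name and name_first_char_lower come from the suggestion above. GO is the batch separator used by SSMS and sqlcmd, needed here because CREATE TRIGGER must start its own batch.

```sql
-- Hypothetical table; only name and name_first_char_lower come from the text
CREATE TABLE dbo.People (
    id int IDENTITY(1,1) PRIMARY KEY,
    name varchar(100) NOT NULL,
    name_first_char_lower char(1) NULL
);
GO
CREATE INDEX IX_People_name_first_char_lower
    ON dbo.People (name_first_char_lower);
GO
CREATE TRIGGER dbo.trg_People_set_first_char
ON dbo.People
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Skip statements that did not set name (this includes the trigger's own update)
    IF NOT UPDATE(name) RETURN;

    -- Recompute the redundant column for the affected rows only
    UPDATE p
    SET name_first_char_lower = LOWER(LEFT(i.name, 1))
    FROM dbo.People AS p
    JOIN inserted AS i ON i.id = p.id;
END;
GO
```

With this in place, the lookup becomes a plain equality comparison on an indexed column. A persisted computed column with an index may be a simpler way to get the same effect without a trigger, but that is a different technique from the one suggested here.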
CHARINDEX vs LIKE search gives very different performance, why?
I will answer my own question, since the correct answer was hard to find and it was the SQL Server 2012 execution plan output that pointed me to the problem. As you can see in the original question, everything looks OK on the surface. This is SQL Server 2008.
When I ran the same query on 2012, I got a warning on the CHARINDEX query. The problem is that SQL Server had to do a type conversion: Address1 is VarChar, and the query has N'1124', which is Unicode (NVarChar). If I change the query like so:
SELECT *
FROM LOCAddress
WHERE (CAST(CHARINDEX(LOWER('1124'), LOWER([Address1])) AS int))
It then runs the same as the LIKE query. So the type conversion caused by the Entity Framework generator was what caused this horrible hit in performance.
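To make the mismatch concrete, here is a small sketch that puts the two literal types side by side. The table variable and its rows are made up, and the > 0 comparison is an assumption, since the full predicate of the original query is not shown. Because nvarchar has higher data type precedence than varchar, it is the varchar column that gets converted, once per row.

```sql
-- Hypothetical stand-in for the LOCAddress table
DECLARE @LOCAddress TABLE (Address1 varchar(100));
INSERT INTO @LOCAddress (Address1) VALUES ('1124 Main St'), ('77 Elm Ave');

-- nvarchar literal: Address1 is implicitly converted to nvarchar for every row
SELECT * FROM @LOCAddress WHERE CHARINDEX(N'1124', Address1) > 0;

-- varchar literal: both arguments are already varchar, so no conversion is needed
SELECT * FROM @LOCAddress WHERE CHARINDEX('1124', Address1) > 0;
```

Both statements return the same rows; the difference only shows up in the execution plan and, on a large table, in the run time.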
Linq: SqlFunctions.PatIndex vs string.Contains for string comparison
I believe version 1 runs faster theoretically.
Reasons:
- Both versions translate into SQL queries eventually.
- Version 1 translates into a query whose where clause uses the 'like' operator.
- Version 2 translates into a query whose where clause uses the 'PatIndex' function.
- A function call in SQL generally takes longer to return results than a pure 'like' operator on a large data set.
Using PatIndex and CharIndex like the InStr function
You've got it reversed.
select patindex('%.com%', 'http://www.stackoverflow.com');
select charindex('.com', 'http://www.stackoverflow.com');
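In both functions the thing you are looking for comes first and the string being searched comes second. A short sketch of the results for the example URL, plus the not-found case and the optional start position of CHARINDEX:

```sql
-- Both return the 1-based position of '.com'
SELECT PATINDEX('%.com%', 'http://www.stackoverflow.com') AS pat_pos,   -- 25
       CHARINDEX('.com', 'http://www.stackoverflow.com')  AS char_pos;  -- 25

-- Both return 0 when there is no match
SELECT PATINDEX('%.org%', 'http://www.stackoverflow.com') AS pat_pos,   -- 0
       CHARINDEX('.org', 'http://www.stackoverflow.com')  AS char_pos;  -- 0

-- CHARINDEX takes an optional start position; PATINDEX does not
SELECT CHARINDEX('.', 'http://www.stackoverflow.com')     AS first_dot, -- 11
       CHARINDEX('.', 'http://www.stackoverflow.com', 12) AS next_dot;  -- 25
```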
Simple Explanation for PATINDEX
Are you asking for realistic use-cases? I can think of two real-life use-cases that I've had at work where PATINDEX() was my best option.
I had to import a text-file and parse it for INSERT INTO later on. But these files sometimes had numbers in this format: 00000-59. If you try CAST('00000-59' AS INT) you'll get an error. So I needed code that would parse 00000-59 to -59 but also 00000159 to 159 etc. The - could be anywhere, or it could simply not be there at all. This is what I did:
DECLARE @my_var VARCHAR(255) = '00000-59', @my_int INT
SET @my_var = STUFF(@my_var, 1, PATINDEX('%[^0]%', @my_var)-1, '')
SET @my_int = CAST(@my_var AS INT)
[^0] in this case means "any character that isn't a 0". So PATINDEX() tells me when the 0's end, regardless of whether that's because of a - or a number.
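The same expression can be tried on several inputs at once. This is only a sketch: the sample values beyond 00000-59 and 00000159 are made up. Note that a string consisting only of zeros is not handled, because PATINDEX then returns 0 and STUFF with a negative length returns NULL.

```sql
SELECT v.raw,
       STUFF(v.raw, 1, PATINDEX('%[^0]%', v.raw) - 1, '') AS stripped,
       CAST(STUFF(v.raw, 1, PATINDEX('%[^0]%', v.raw) - 1, '') AS int) AS as_int
FROM (VALUES ('00000-59'),   -- stripped '-59',   as_int -59
             ('00000159'),   -- stripped '159',   as_int 159
             ('000-0059'),   -- stripped '-0059', as_int -59
             ('59')          -- stripped '59',    as_int 59
     ) AS v(raw);
```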
The second use-case I've had was checking whether an IBAN number was correct. In order to do that, any letters in the IBAN need to be changed to a corresponding number (A=10, B=11, etc...). I did something like this (incomplete but you get the idea):
SET @i = PATINDEX('%[^0-9]%', @IBAN)
WHILE @i <> 0 BEGIN
SET @num = UNICODE(SUBSTRING(@IBAN, @i, 1))-55
SET @IBAN = STUFF(@IBAN, @i, 1, CAST(@num AS VARCHAR(2)))
SET @i = PATINDEX('%[^0-9]%', @IBAN)
END
So again, I'm not concerned with finding exactly the letter A or B etc. I'm just finding anything that isn't a number and converting it.
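A self-contained version of that loop might look like the following. The declarations, the UPPER call and the sample value are assumptions added for illustration; the sample is a short fragment, not a complete IBAN. The sketch assumes the input contains only digits and the letters A to Z, so anything else (spaces, punctuation) should be stripped beforehand.

```sql
-- Hypothetical input fragment; UPPER makes the -55 offset valid for lowercase input too
DECLARE @IBAN varchar(100) = UPPER('GB82WEST');
DECLARE @i int, @num int;

SET @i = PATINDEX('%[^0-9]%', @IBAN);          -- position of the first non-digit
WHILE @i <> 0
BEGIN
    -- 'A' is code 65, so subtracting 55 maps A=10, B=11, ... Z=35
    SET @num = UNICODE(SUBSTRING(@IBAN, @i, 1)) - 55;
    SET @IBAN = STUFF(@IBAN, @i, 1, CAST(@num AS varchar(2)));
    SET @i = PATINDEX('%[^0-9]%', @IBAN);      -- look for the next non-digit
END;

SELECT @IBAN AS digits_only;                   -- 16118232142829
```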
Performance of SUBSTRING vs LEFT in SQL Server
SQL Server is a database. You do not ask which string processing function is 'faster'. You ask 'which can use an index?' and 'do I have the required index?'. It is all about data access, because disks are sloooooow, not about shifting CPU registers.
So, which can use an index (which one is sargable)? In theory LEFT could use an index, but in practice it usually does not. SUBSTRING cannot. Instead of SUBSTRING, use Full Text.
Design your data model to take advantage of sargable expressions, index accordingly. That's all there is to it, there is no magic bullet. Avoid scans.
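As a rough illustration of what sargable means here, compare a prefix search written three ways. The table dbo.Customers and its index are hypothetical, and the comments describe typical behaviour; the actual execution plans should be checked on real data.

```sql
-- Hypothetical table and index, for illustration only
CREATE TABLE dbo.Customers (
    id int IDENTITY(1,1) PRIMARY KEY,
    name varchar(100) NOT NULL
);
CREATE INDEX IX_Customers_name ON dbo.Customers (name);

-- Sargable: a LIKE with a literal prefix can be answered with an index seek
SELECT id, name FROM dbo.Customers WHERE name LIKE 'Sm%';

-- Typically not sargable: the column is wrapped in a function, so expect a scan
SELECT id, name FROM dbo.Customers WHERE LEFT(name, 2) = 'Sm';
SELECT id, name FROM dbo.Customers WHERE SUBSTRING(name, 1, 2) = 'Sm';
```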