SQL Server Index - Any improvement for LIKE queries?
Only if you add full-text searching to those columns, and use the full-text query capabilities of SQL Server.
Otherwise, no, an index will not help.
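A minimal sketch of that setup, assuming a table dbo.Product with a ProductName column and a unique key index named PK_Product (all names here are placeholders):

```sql
-- One-time setup: a full-text catalog and a full-text index
-- on the column you want to search
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;

CREATE FULLTEXT INDEX ON dbo.Product(ProductName)
    KEY INDEX PK_Product   -- must reference a unique, non-nullable key index
    ON ftCatalog;

-- Then query with the full-text predicates instead of LIKE
SELECT * FROM dbo.Product WHERE CONTAINS(ProductName, 'furniture');
```

Note that full-text population is asynchronous by default, so the index may lag slightly behind recent inserts.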
Does Adding Indexes speed up String Wildcard % searches?
Creating a normal index will not help (*), but a full-text index will, though you would have to change your query to something like this:
SELECT * FROM dbo.Product WHERE CONTAINS(ProductName, 'furniture')
(* -- well, it can be slightly helpful, in that it can reduce a scan over every row and column in your table into a scan over merely every row and only the relevant columns. However, it will not achieve the orders of magnitude performance boost that we normally expect from indexes that turn scans into single seeks.)
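To illustrate that caveat: a plain nonclustered index can narrow the scan when the query only touches indexed columns (names below are assumptions):

```sql
-- A normal nonclustered index. LIKE '%...%' still cannot seek into it,
-- but if the query reads only indexed columns, the optimizer can scan
-- this narrower structure instead of the whole base table.
CREATE NONCLUSTERED INDEX IX_Product_ProductName
    ON dbo.Product (ProductName);

SELECT ProductName                        -- covered by the index
FROM dbo.Product
WHERE ProductName LIKE '%furniture%';     -- index scan, not a seek
```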
How to make LIKE '%Search%' faster in SQL Server
You are right: queries with a leading wildcard are awful for performance. To get around this, SQL Server has something called full-text search. You create a special FULL TEXT index for each of the columns you want to search, and then update your code to use the CONTAINS predicate:
SELECT
    p.CrmId,
    park.Name
FROM Property p
INNER JOIN Som som ON som.CrmId = p.SystemOfMeasurementId
LEFT JOIN Park park ON park.CrmId = p.ParkId
WHERE
    (
        CONTAINS(p.City, @search)
        OR CONTAINS(p.Address1, @search)
        OR CONTAINS(p.Address2, @search)
        OR CONTAINS(p.State, @search)
        OR CONTAINS(park.Name, @search)
        OR CONTAINS(p.ZipCode, @search)
    )
    AND (@usOnly = 0 OR (p.CrmCountryId = @USA_COUNTRY_ID))
Unfortunately, all those OR conditions are still likely to make this pretty slow, and FULL TEXT wasn't intended as much for shorter strings like City or State, or for casting wide nets like this. You may find you'll do much better for this kind of search by integrating with a tool like Solr or ElasticSearch. In addition to writing a better and faster search, these tools will help you create sane rankings for returning results in an order that makes sense and is relevant to the input.
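As a middle ground, CONTAINS also accepts a column list, which can collapse the ORs over columns of the same table into a single predicate (park.Name lives in another table, so it would still need its own predicate; this assumes all listed columns are part of the same full-text index):

```sql
-- One predicate over several full-text-indexed columns of Property
SELECT p.CrmId
FROM Property p
WHERE CONTAINS((p.City, p.Address1, p.Address2, p.State, p.ZipCode), @search);
```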
Another strategy is to create a computed column that concatenates your address and name text into a single column, and then do a single FULL TEXT index on that one field, with a single CONTAINS() call.
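A hedged sketch of that computed-column approach (column and index names are assumptions):

```sql
-- Concatenate the searchable text into one persisted column
ALTER TABLE Property ADD SearchText AS
    CONCAT(City, ' ', Address1, ' ', Address2, ' ', State, ' ', ZipCode)
    PERSISTED;

-- Full-text index it; if your SQL Server version rejects a full-text
-- index on a computed column, materialize the concatenation into a
-- real column (e.g. maintained by a trigger) instead
CREATE FULLTEXT INDEX ON Property(SearchText)
    KEY INDEX PK_Property   -- your unique key index
    ON ftCatalog;

-- One predicate instead of six
SELECT CrmId FROM Property WHERE CONTAINS(SearchText, @search);
```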
Improve performance of SQL Query with dynamic like
Take a look at my answer about using the LIKE operator here. It can be quite performant if you use some tricks.
You can gain a lot of speed if you play with collation. Try this:
SELECT DISTINCT TOP 10 p.[Id], n.[LastName], n.[FirstName]
FROM [dbo].[people] p
INNER JOIN [dbo].[people_NAME] n ON n.[Id] = p.[Id]
WHERE EXISTS (
    SELECT 'x' x
    FROM [dbo].[people_NAME] n2
    WHERE n2.[Id] != p.[Id]
    AND LOWER(n2.[FirstName]) COLLATE Latin1_General_BIN
        LIKE '%' + LOWER(n.[FirstName]) + '%' COLLATE Latin1_General_BIN
)
As you can see, we are using binary comparison instead of string comparison, which is much more performant.
Be careful: you are working with people's names, so you can run into issues with special Unicode characters, unusual accents, and so on.
Normally the EXISTS clause is better than an INNER JOIN, but you are also using a DISTINCT, which amounts to a GROUP BY on all columns, so why not use that instead? You can switch to an INNER JOIN and use GROUP BY instead of DISTINCT; testing COUNT(*) > 1 will be (very slightly) more performant than testing WHERE n2.[Id] != p.[Id], especially if your TOP clause extracts many rows.
Try this:
SELECT TOP 10 p.[Id], n.[LastName], n.[FirstName]
FROM [dbo].[people] p
INNER JOIN [dbo].[people_NAME] n ON n.[Id] = p.[Id]
INNER JOIN [dbo].[people_NAME] n2 ON
    LOWER(n2.[FirstName]) COLLATE Latin1_General_BIN
    LIKE '%' + LOWER(n.[FirstName]) + '%' COLLATE Latin1_General_BIN
GROUP BY p.[Id], n.[LastName], n.[FirstName]
HAVING COUNT(*) > 1
Here each name also matches itself, so every name will have at least one match. But we only want names that match other names, so we keep only rows with a match count greater than one (COUNT(*) = 1 means the name matched only itself).
EDIT: I ran all tests using a table of 100,000 random names and found that, in this scenario, normal usage of the LIKE operator is about three times slower than binary comparison.
Recommended index and query improvement
First of all, besides implementing a correct index strategy, you should follow some general tips for optimizing query execution time:
- Avoid functions in the inner SELECT and JOIN clauses. Functions (even when cached) should be executed for as few records as possible, and usually that means the outermost SELECT.
- Avoid subqueries when possible; choose a JOIN instead.
- Avoid non-numeric fields in WHERE clauses when possible; an index scan on an INT field is much, much faster than on a VARCHAR.
- Avoid the WITH (NOLOCK) hint: it reads uncommitted data, it doesn't make the query faster, and you risk working with a dirty dataset.
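As an illustration of the first two tips, here is a hypothetical rewrite (table and column names are invented for the example) that replaces a correlated subquery with a JOIN and takes a per-row function call out of the filter:

```sql
-- Before: subquery, and UPPER() evaluated for every Customers row
SELECT o.Id
FROM Orders o
WHERE o.CustomerId IN (SELECT c.Id
                       FROM Customers c
                       WHERE UPPER(c.Country) = 'US');

-- After: plain JOIN, no per-row function call
-- (assumes Country is stored consistently cased)
SELECT o.Id
FROM Orders o
INNER JOIN Customers c ON c.Id = o.CustomerId
WHERE c.Country = 'US';
```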
When trying to optimize a query, also keep in mind the logical order in which SQL Server processes it:
- FROM and JOIN
- WHERE
- GROUP BY and HAVING
- SELECT
So try to write your query to reduce the number of records returned by each of these steps, in this order.
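The steps above can be annotated on a small example query (names are placeholders):

```sql
SELECT   c.Country, COUNT(*) AS Orders    -- 4. SELECT: project last
FROM     Orders o                          -- 1. FROM/JOIN: build row source
JOIN     Customers c ON c.Id = o.CustomerId
WHERE    o.Status = 'shipped'              -- 2. WHERE: filter rows early
GROUP BY c.Country                         -- 3. GROUP BY/HAVING: aggregate
HAVING   COUNT(*) > 10;                    --    then filter groups
```

The cheapest query is the one where each step hands as few rows as possible to the next.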
That being said, an index must be created according to the query that uses it, and you can find helpful hints by running the query with the actual execution plan included; SSMS often helps you a lot here.
In this case I'd add an index on the URL and Timestamp fields, in that order:
CREATE CLUSTERED INDEX idx_Log ON yourDatabase.dbo.[log] (URL, Timestamp)
Why this index doesn't improve query performance
Since you're fetching most of the rows in the tables, the indexes have to be covering (=contain every column you need in your query from that table) to help you at all -- and that improvement might not be much.
The reason the indexes don't really help is that you're reading most of the rows and you have IrreleventFields in your query. Since the index contains only the index key + clustered key, the rest of the fields must be fetched from the table (= the clustered index) using the clustered index key. That's called a key lookup, and it can be very costly, because it has to be done for every single row found in the index that matches your search criteria.
To make the index covering, you can add the "irrelevant" fields to the INCLUDE part of the index, if you want to see whether that improves the situation.
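A sketch of such a covering index, with placeholder table and key-column names (only IrreleventField is taken from the question):

```sql
-- Key on the column(s) you filter by; the other columns the query
-- reads are carried in the INCLUDE list at the leaf level, so no
-- key lookup is needed
CREATE NONCLUSTERED INDEX IX_MyTable_SearchCol
    ON dbo.MyTable (SearchCol)
    INCLUDE (IrreleventField);
```

INCLUDE columns fatten the index, so add only what the query actually selects.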