Why Is Query with Phone = N'1234' Slower Than Phone = '1234'

Why is query with phone = N'1234' slower than phone = '1234'?

Because nvarchar has higher datatype precedence than varchar so it needs to perform an implicit cast of the column to nvarchar and this prevents an index seek.

Under some collations it is able to still use a seek and just push the cast into a residual predicate against the rows matched by the seek (rather than needing to do this for every row in the entire table via a scan) but presumably you aren't using such a collation.

The effect of collation on this is illustrated below. When using the SQL collation you get a scan, for the Windows collation it calls the internal function GetRangeThroughConvert and is able to convert it into a seek.

CREATE TABLE [dbo].[phone]
(
phone1 VARCHAR(500) COLLATE sql_latin1_general_cp1_ci_as CONSTRAINT uq1 UNIQUE,
phone2 VARCHAR(500) COLLATE latin1_general_ci_as CONSTRAINT uq2 UNIQUE,
);

SELECT phone1 FROM [dbo].[phone] WHERE phone1 = N'5554474477';
SELECT phone2 FROM [dbo].[phone] WHERE phone2 = N'5554474477';

Sample Image

The SHOWPLAN_TEXT is below

Query 1

  |--Index Scan(OBJECT:([tempdb].[dbo].[phone].[uq1]),  WHERE:(CONVERT_IMPLICIT(nvarchar(500),[tempdb].[dbo].[phone].[phone1],0)=CONVERT_IMPLICIT(nvarchar(4000),[@1],0)))

Query 2

  |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1005], [Expr1006], [Expr1004]))
|--Compute Scalar(DEFINE:(([Expr1005],[Expr1006],[Expr1004])=GetRangeThroughConvert([@1],[@1],(62))))
| |--Constant Scan
|--Index Seek(OBJECT:([tempdb].[dbo].[phone].[uq2]), SEEK:([tempdb].[dbo].[phone].[phone2] > [Expr1005] AND [tempdb].[dbo].[phone].[phone2] < [Expr1006]), WHERE:(CONVERT_IMPLICIT(nvarchar(500),[tempdb].[dbo].[phone].[phone2],0)=[@1]) ORDERED FORWARD)

In the second case the compute scalar emits the following values

Expr1004 = 62
Expr1005 = '5554474477'
Expr1006 = '5554474478'

the seek predicate shown in the plan is on phone2 > Expr1005 and phone2 < Expr1006 so on the face of it would exclude '5554474477' but the flag 62 means that this does match.

NB: The range that the seek will cover will depend on the length of the string you are performing the equality on here.

For the predicate = N'a' for example the seek will still read the entire range of index values beginning with an a and have a residual predicate preserving only those matching = 'a'. The predicate = N'' is even worse. With the zero length prefix it ends up reading the whole index.

Fast search solution for numeric type of large mysql table?

You are storing numbers as INT, but querying then as CHAR (the LIKE operator implicitly converts INTs to CHARs) and it surely is not optimal. If you'd like to keep numbers as INT (probably the best idea in IO performance therms), you'd better change your queries to use numerical comparisons:

-- instead of CHAR operators 
WHERE phone_number LIKE '%4567'
WHERE phone_number LIKE '1234%'
-- use NUMERIC operators
WHERE phone_number % 10000 = 4567
WHERE phone_number >= 12340000 -- considering 8 digit numbers

Besides choosing a homogeneous way to store and query data, you should keep in mind to create the appropriate index CREATE INDEX IDX0 ON table (phone_number);.

Unfortunately, even then your query might not be optimal, because of effects similar to @ron have commented about. In this case you might have to tune your table to break this column into more manageable columns (like national_code, area_code and phone_number). This would allow an index efficient query by area-codes, for example.

Querying a Database with Phone Number Comparison

I finally worked out an answer to this in case anyone is interested. There isn't a way to build the phone number comparison into the database query - you need to parse all numbers beforehand and put them into a standard format (probably e164) using something like libphonenumber. Once they are all in a standard format in the database, querying against them is simple.

why is phone number not stored in integer datataype?

Because your number is out of range for an int type. You could try using bigint instead, or store the column as a varchar which is likely more appropriate as described here.

See this answer. An int is always stored with 4 bytes. I think you are misinterpreting what the (100) does, it just controls the display width.

More info on numeric types in the docs.

SQLite SELECT very slow on Windows Phone

SQLite can easily return 15,000 records from a simple table in a fraction of a second on a Windows Phone (tested on a Lumia 920).

There are other things causing your poor performance. If you have a huge number of columns, that might be a problem. Depending on how the SQLite wrapper is implemented (I don't know), two possible culprits are use of Reflection to fill your result objects or per-row Async overhead. But again, I don't know how that wrapper is implemented specifically.

The way to speed it up (other than to return less data) is to write your code in C++ and wrap it in a WinRT component to be called by your managed app.



Related Topics



Leave a reply



Submit