Performance of Querying for a String That Starts and Ends with Something

The SQL optimizer will look for an index on the column in the second example and use it to narrow the search to the records that begin with the characters before the wildcard. In the first example it can't narrow the search, so you'll likely scan either the whole index or the whole table (depending on the index structure).
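A rough way to picture why the position of the wildcard matters is to let a sorted list stand in for the index. This is only a sketch in Python: the data and the function names are made up for illustration, and a real B-tree index is more involved.

```python
from bisect import bisect_left

# Hypothetical data: a sorted list standing in for an index on the column.
index = sorted(["apple", "apricot", "banana", "grape", "grapefruit", "pineapple"])

def starts_with(prefix):
    """Like 'prefix%': binary-search to the first candidate, stop at the first miss."""
    matches = []
    for value in index[bisect_left(index, prefix):]:
        if not value.startswith(prefix):
            break  # sorted order guarantees that no later value can match
        matches.append(value)
    return matches

def ends_with(suffix):
    """Like '%suffix': the sort order is no help, so every entry is inspected."""
    return [value for value in index if value.endswith(suffix)]

print(starts_with("grape"))  # ['grape', 'grapefruit']
print(ends_with("apple"))    # ['apple', 'pineapple']
```

The prefix search touches only a contiguous slice of the sorted data, while the suffix search has to look at all of it.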

Which SQL product are you using?

Does long query string affect the speed?

I would prefer IN in this case. To check the performance, look at the execution plan of each query you are running; comparing the two plans will show what difference to expect.

Something like this:

SELECT id from users where collegeid IN ('1','2','3'....,'1000')

According to the MySQL documentation:

If all values are constants, they are evaluated according to the type
of expr and sorted. The search for the item then is done using a
binary search. This means IN is very quick if the IN value list
consists entirely of constants.

The number of values in the IN list is only limited by the
max_allowed_packet value.
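The sort-then-binary-search idea in that quote can be sketched in a few lines of Python. This only illustrates the technique and is not MySQL's actual implementation; the helper names and the row data are hypothetical.

```python
from bisect import bisect_left

def make_in_list(constants):
    """Sort the constant list once, before any row is examined."""
    return sorted(constants)

def in_list(sorted_values, item):
    """Binary search: O(log n) comparisons per row instead of one per list entry."""
    pos = bisect_left(sorted_values, item)
    return pos < len(sorted_values) and sorted_values[pos] == item

# Stand-in for IN ('1', '2', ..., '1000'); the values are strings, as in the query.
college_ids = make_in_list(str(n) for n in range(1, 1001))

# Hypothetical rows of (id, collegeid).
users = [(1, "42"), (2, "1001"), (3, "1000"), (4, "0")]
print([uid for uid, collegeid in users if in_list(college_ids, collegeid)])  # [1, 3]
```

With 1000 constants, each row needs about ten comparisons rather than up to a thousand.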

You may also check "IN vs OR in the SQL WHERE Clause" and "MYSQL OR vs IN performance".

The answer given by Ergec is very useful:

SELECT * FROM item WHERE id = 1 OR id = 2 ... id = 10000

This query took 0.1239 seconds

SELECT * FROM item WHERE id IN (1,2,3,...10000)

This query took 0.0433 seconds

In this test, IN was about 3 times faster than OR.

Will it affect the speed or output in any way?

So the answer is yes: the performance will be affected.

Performance of RegEx vs LIKE in MySQL queries

It is possible that it could be faster, because the LIKE condition can be evaluated more quickly than the regular expression, so if most rows fail the test it could be faster. However, it will be slower if most rows succeed, as two tests must be run for successful rows instead of just one. It also depends on which expression the optimizer chooses to run first.

An even bigger speedup can be witnessed if you have something like this:

SELECT * FROM (
    SELECT * FROM lineage_string
    WHERE lineage LIKE '179%'
) AS prefiltered  -- MySQL requires an alias on every derived table
WHERE lineage REGEXP '^179(/|$)'

Now an index can be used to find likely rows because LIKE '179%' is sargable. Many rows won't need to be checked at all.
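The two-step filter can be mimicked outside the database to check that the cheap prefix test never changes the result, only the number of rows the regular expression has to examine. A minimal Python sketch, with made-up sample values:

```python
import re

# Hypothetical lineage values.
lineages = ["179", "179/4", "1790/2", "18/179", "179/4/9", "2"]
pattern = re.compile(r"^179(/|$)")

# Regular expression applied to every row.
regex_only = [s for s in lineages if pattern.search(s)]

# Cheap prefix test first (the LIKE '179%' step), regex only on the survivors.
prefiltered = [s for s in lineages if s.startswith("179")]
two_step = [s for s in prefiltered if pattern.search(s)]

print(two_step)                # ['179', '179/4', '179/4/9']
print(two_step == regex_only)  # True
print(len(prefiltered), "of", len(lineages), "rows reach the regex")  # 4 of 6 rows reach the regex
```

In the database the gain can be larger than in this sketch, because the prefix step can be answered from an index instead of by testing each row.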

As always the best way to be sure is to measure it for yourself on your actual data.

SQL 'like' vs '=' performance

See https://web.archive.org/web/20150209022016/http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx

Quote from there:

the rules for index usage with LIKE
are loosely like this:

  1. If your filter criteria uses equals =
     and the field is indexed, then most
     likely it will use an INDEX/CLUSTERED
     INDEX SEEK

  2. If your filter criteria uses LIKE,
     with no wildcards (like if you had a
     parameter in a web report that COULD
     have a % but you instead use the full
     string), it is about as likely as #1
     to use the index. The increased cost
     is almost nothing.

  3. If your filter criteria uses LIKE, but
     with a wildcard at the beginning (as
     in Name0 LIKE '%UTER') it's much less
     likely to use the index, but it still
     may at least perform an INDEX SCAN on
     a full or partial range of the index.

  4. HOWEVER, if your filter criteria uses
     LIKE, but starts with a STRING FIRST
     and has wildcards somewhere AFTER that
     (as in Name0 LIKE 'COMP%ER'), then SQL
     may just use an INDEX SEEK to quickly
     find rows that have the same first
     starting characters, and then look
     through those rows for an exact match.


(Also keep in mind, the SQL engine
still might not use an index the way
you're expecting, depending on what
else is going on in your query and
what tables you're joining to. The
SQL engine reserves the right to
rewrite your query a little to get the
data in a way that it thinks is most
efficient and that may include an
INDEX SCAN instead of an INDEX SEEK)

Any performance impact in Oracle for using LIKE 'string' vs = 'string'?

There is a clear difference when you use bind variables, which you should be using in Oracle for anything other than data warehousing or other bulk data operations.

Take the case of:

SELECT * FROM SOME_TABLE WHERE SOME_FIELD LIKE :b1

Oracle cannot know whether the value of :b1 is '%some_value%', 'some_value', or something else until execution time, so it estimates the cardinality of the result from heuristics and comes up with a plan that may or may not be suitable for the various possible values of :b1, such as '%A', '%', 'A', etc.

Similar issues can apply with an equality predicate but the range of cardinalities that might result is much more easily estimated based on column statistics or the presence of a unique constraint, for example.
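To make the point concrete: with a bind variable the statement text is identical for every pattern, yet the number of matching rows varies widely. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it shows the binding behaviour only and says nothing about Oracle's optimizer; the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (some_field TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?)",
    [("A",), ("BA",), ("ABC",), ("XYZ",)],  # hypothetical rows
)

# One statement text; the pattern is supplied only at execution time.
query = "SELECT COUNT(*) FROM some_table WHERE some_field LIKE :b1"

counts = {
    pattern: conn.execute(query, {"b1": pattern}).fetchone()[0]
    for pattern in ("A", "%A", "%")
}
print(counts)  # {'A': 1, '%A': 2, '%': 4}
```

A plan chosen once for this statement has to serve all three cases, from a single row to the whole table.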

So, personally I wouldn't start using LIKE as a replacement for =. The optimizer is pretty easy to fool sometimes.

How to Increase SQL Query Performance on where clauses searching concatenated values

When you concatenate the string, you prevent the SQL optimizer from using the existing index on MainTable (USERNAME). That forces the engine to follow a different [slower] path, probably a HEAP [TABLE] SCAN. As simple as that.

If you really need to provide the full email address I would compute the concatenation in the last step and not before, essentially going back to your first option. For example:

Select USERNAME || '@domain.com', RCD
From (
    With Exmp1 AS (
        Select ID, RCD
        From Table1 a
        Where EFFDT = (Select Max(b.EFFDT)
                       From Table1 b
                       Where a.ID = b.ID and a.RCD = b.RCD)
          and status = 'A'
    )
    Select USERNAME, RCD
    From MainTable MT Inner Join Exmp1 E1 ON MT.ID = E1.ID
)
Where USERNAME = 'test1'

EDIT:

Taking the idea one step further you can rephrase the whole query and find out which optimizations are easily visible once the query is simplified:

  • The thing is the column MT.USERNAME is probably much more selective than a.STATUS, so you should filter by it first.
  • Then, to make the correlated subquery fast, you probably want to use a "covering index" on it, so I suggest adding ix2 as shown below.

For example:

Select MT.USERNAME || '@domain.com', a.RCD
From MainTable MT
Join Table1 a on a.ID = MT.ID
Where MT.USERNAME = 'test1'
  and a.status = 'A'
  and a.EFFDT = (
      Select Max(b.EFFDT)
      From Table1 b
      Where a.ID = b.ID and a.RCD = b.RCD
  )

Now, in order for this query to be real fast you'll need the following indexes. It seems you already have the first one:

create index ix1 on MainTable (USERNAME); -- You already have this one
create index ix2 on Table1 (ID, RCD, EFFDT);

SECOND EDIT: If you really want to search using the full username you can add an index on an expression. Take your "Example 2" and change the WHERE condition as shown below:

Select * From
(
    With Exmp1 AS (
        Select ID, RCD
        From Table1 a
        Where EFFDT = (Select Max(b.EFFDT)
                       From Table1 b
                       Where a.ID = b.ID and a.RCD = b.RCD)
          and status = 'A'
    )
    Select USERNAME || '@domain.com', RCD
    From MainTable MT Inner Join Exmp1 E1 ON MT.ID = E1.ID
    -- changed here; the filter sits inside the subquery, where USERNAME is still in scope
    Where USERNAME || '@domain.com' = 'test1@domain.com'
)

Then add the following index:

create index ix3 on MainTable (USERNAME || '@domain.com');

This should make the query fast, since the filtering predicate will be an exact match with the indexed expression.
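An index on an expression can be tried out quickly with Python's built-in sqlite3 module. SQLite is only a stand-in here and its planner is not Oracle's, so treat this as a sketch: the sample rows are made up, and the printed plan is for inspection only, since its exact wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MainTable (ID INTEGER PRIMARY KEY, USERNAME TEXT)")
conn.executemany(
    "INSERT INTO MainTable (USERNAME) VALUES (?)",
    [("test1",), ("test2",), ("other",)],  # hypothetical rows
)

# Index on the concatenated expression, mirroring ix3.
conn.execute("CREATE INDEX ix3 ON MainTable (USERNAME || '@domain.com')")

# The predicate repeats the indexed expression exactly.
query = "SELECT ID, USERNAME FROM MainTable WHERE USERNAME || '@domain.com' = ?"
rows = conn.execute(query, ("test1@domain.com",)).fetchall()
print(rows)  # [(1, 'test1')]

# Ask the engine how it intends to run the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("test1@domain.com",)).fetchall()
for step in plan:
    print(step)
```

The expression in the WHERE clause has to be written the same way as in the index definition; a differently written but equivalent expression may not be matched to the index.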


