SQL Statements with Equals VS In


The two will produce the same execution plan - either a table scan, index scan, or index seek, depending on if/how you have your table indexed.

You can see for yourself: read "Displaying Graphical Execution Plans (SQL Server Management Studio)", in particular the section called "Using the Execution Plan Options".

Performance differences between equal (=) and IN with one literal value

There is no difference between those two statements, and the optimiser will transform the IN to the = when IN has just one element in it.

When you have a question like this, though, just run both statements, look at their execution plans, and compare them. Here you won't find any differences.
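To try the comparison without a database server, a minimal sketch using Python's built-in sqlite3 module could look like the following. SQLite only stands in for whatever DBMS you actually use, and the table, column, and index names (employees, identity_number, idx_emp_identity) are made up for illustration. Plan wording differs between engines and versions, so read the printed plans rather than relying on exact strings.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for your real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, identity_number TEXT)")
conn.execute("CREATE INDEX idx_emp_identity ON employees (identity_number)")
conn.executemany(
    "INSERT INTO employees (identity_number) VALUES (?)",
    [(str(100000000 + i),) for i in range(1000)],
)

def plan(sql):
    """Return the plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

eq_sql = "SELECT * FROM employees WHERE identity_number = '100000123'"
in_sql = "SELECT * FROM employees WHERE identity_number IN ('100000123')"

eq_plan, in_plan = plan(eq_sql), plan(in_sql)
eq_rows = conn.execute(eq_sql).fetchall()
in_rows = conn.execute(in_sql).fetchall()

print("=  plan:", eq_plan)
print("IN plan:", in_plan)
print("same rows:", eq_rows == in_rows)
```

On your own system, replace the two statements with your real queries and use the plan command your engine provides.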

After a big search online, I found a document on SQL to support this (I assume it applies to all DBMS):

If there is only one value inside the parenthesis, this commend [sic] is equivalent to,

WHERE "column_name" = 'value1'

Here is the execution plan of both queries in Oracle (most DBMS will process this the same):

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number = '123456789'

Plan hash value: 2312174735
-----------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
| 2 | INDEX UNIQUE SCAN | SYS_C0029838 |
-----------------------------------------------------

And for IN() :

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number in('123456789');

Plan hash value: 2312174735
-----------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
| 2 | INDEX UNIQUE SCAN | SYS_C0029838 |
-----------------------------------------------------

As you can see, both are identical. This is on an indexed column. Same goes for an unindexed column (just full table scan).

SQL - IN clause vs equals operator for small list

Both are the same.
SQL Server converts this

where @TeamId in (Team1Id, Team2Id)

Into

where @TeamId=Team1Id or @TeamId=Team2Id

It's better to write IN rather than OR: it is more readable and easier to write.
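The equivalence is easy to check on a small table. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server, with a hypothetical matches table. The Team1Id and Team2Id columns mirror the snippet above, and the :team parameter plays the role of @TeamId. It only shows that both forms return the same rows; it says nothing about how SQL Server rewrites the query internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: each row is a match between two teams.
conn.execute(
    "CREATE TABLE matches (match_id INTEGER PRIMARY KEY, Team1Id INTEGER, Team2Id INTEGER)"
)
conn.executemany(
    "INSERT INTO matches (Team1Id, Team2Id) VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 4), (4, 1)],
)

team_id = 1

# Variable on the left, columns in the IN list.
in_rows = conn.execute(
    "SELECT match_id FROM matches WHERE :team IN (Team1Id, Team2Id) ORDER BY match_id",
    {"team": team_id},
).fetchall()

# The same condition spelled out with OR.
or_rows = conn.execute(
    "SELECT match_id FROM matches WHERE :team = Team1Id OR :team = Team2Id ORDER BY match_id",
    {"team": team_id},
).fetchall()

print(in_rows, or_rows)
```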

sql queries : IN vs equal

For one item it won't have an impact. I tested it on a table that has 25000 rows and about 15 columns (no large text columns).

If I use this:

EXPLAIN  
SELECT
*
FROM users
WHERE
user_id = 10104

The query plan is simple:

Index Scan using users_pkey on users  (cost=0.29..8.30 rows=1 width=147)
Index Cond: (user_id = 10104)

Since user_id is the primary key, the planner uses the primary key index.

Now second query:

EXPLAIN  
SELECT
*
FROM users
WHERE
user_Id IN(10104)

Yields EXACTLY the same query plan because the database knows those are equal queries in the end.

I would advise you to run the same queries (with EXPLAIN) on your own tables and have a look for yourself, because I don't know the structure of your tables, indexes, etc.

Run the same comparison for a query with more than one value and things get more interesting, but since your question is about just one item, you can sleep well knowing it will not have an impact.

Even if it had, in my opinion it would be so small that you shouldn't waste too much time thinking about what the best practice is here. Make sure your code is readable and well-written and that's already a step in the right direction.

CASE Statement in where clause using equal to and IN

A CASE statement can't return a set of values... but this query should give you the same results:

WHERE CONDITION1='ABC'
AND Status =
CASE @Option -- hypothetical name: the value being tested is missing from the original snippet
WHEN 1 THEN 'True'
WHEN 2 THEN 'False'
WHEN 3 THEN NULL
WHEN 4 THEN Status
END

Also, note that unless you have ANSI_NULLS OFF, Status will never equal NULL. You would need to use IS NULL for this comparison, and you'd need to forgo the CASE statement altogether.
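The NULL behaviour can be seen in a few lines. This sketch uses Python's sqlite3 module, which has no ANSI_NULLS setting and always follows the standard rule, so it only illustrates the default (ANSI_NULLS ON) case. The items table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, Status TEXT)")
conn.executemany(
    "INSERT INTO items (Status) VALUES (?)",
    [("True",), ("False",), (None,)],
)

def count(where):
    """Count the rows of items that satisfy the given WHERE condition."""
    return conn.execute("SELECT COUNT(*) FROM items WHERE " + where).fetchone()[0]

# "= NULL" evaluates to unknown for every row, so nothing matches.
equals_null = count("Status = NULL")

# A CASE branch that yields NULL runs into the same problem.
case_null = count("Status = CASE 3 WHEN 3 THEN NULL END")

# IS NULL is the comparison that actually finds the NULL row.
is_null = count("Status IS NULL")

print(equals_null, case_null, is_null)
```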

MySQL Performance - IN Clause vs. Equals (=) for a Single Value

Neither of them really matters in the grand scheme of things. The network latency in communicating with the database will far outweigh either the count($object_ids) overhead or the = vs IN overhead. I would call this a case of premature optimization.

You should profile and load-test your application to learn where the real bottlenecks are.

IN vs OR in the SQL WHERE clause

I assume you want to know the performance difference between the following:

WHERE foo IN ('a', 'b', 'c')
WHERE foo = 'a' OR foo = 'b' OR foo = 'c'

According to the MySQL manual, if all the values are constants, IN sorts the list and then uses a binary search. I would imagine that OR evaluates them one by one in no particular order, so IN is faster in some circumstances.

The best way to know is to profile both on your database with your specific data to see which is faster.

I tried both on a MySQL table with 1000000 rows. When the column is indexed there is no discernible difference in performance: both are nearly instant. When the column is not indexed I got these results:

SELECT COUNT(*) FROM t_inner WHERE val IN (1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000);
1 row fetched in 0.0032 (1.2679 seconds)

SELECT COUNT(*) FROM t_inner WHERE val = 1000 OR val = 2000 OR val = 3000 OR val = 4000 OR val = 5000 OR val = 6000 OR val = 7000 OR val = 8000 OR val = 9000;
1 row fetched in 0.0026 (1.7385 seconds)

So in this case the method using OR is about 37% slower (1.7385 seconds versus 1.2679 seconds). Adding more terms makes the difference larger. Results may vary on other databases and on other data.
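If you want to repeat this kind of measurement yourself, a small timing harness is enough. The sketch below uses Python's sqlite3 module with a smaller, made-up copy of t_inner (100,000 rows, no index). SQLite's optimizer is not MySQL's and may treat the two forms identically, so the timings it prints say nothing about MySQL. The point is only the shape of the test: same data, same result, best of several runs.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_inner (val INTEGER)")  # deliberately no index on val
conn.executemany("INSERT INTO t_inner (val) VALUES (?)", ((i,) for i in range(100_000)))

values = list(range(1000, 10000, 1000))  # 1000, 2000, ..., 9000

in_sql = "SELECT COUNT(*) FROM t_inner WHERE val IN ({})".format(
    ", ".join(str(v) for v in values)
)
or_sql = "SELECT COUNT(*) FROM t_inner WHERE " + " OR ".join(
    "val = {}".format(v) for v in values
)

def timed(sql, repeats=5):
    """Run the query several times; return (result, best wall-clock seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = conn.execute(sql).fetchone()[0]
        best = min(best, time.perf_counter() - start)
    return result, best

in_count, in_secs = timed(in_sql)
or_count, or_secs = timed(or_sql)

print("IN:", in_count, "rows,", round(in_secs, 4), "s")
print("OR:", or_count, "rows,", round(or_secs, 4), "s")
```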

Difference between NOT IN and equals vs. IN and not equals

NOT IN is going to give you the wrong results if id is nullable (which I hope it is not, otherwise it has a terrible name).
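The NULL problem with NOT IN is easy to reproduce. The following sketch uses Python's sqlite3 module and two hypothetical tables, rates (with a nullable id) and records. A single NULL in the subquery result makes NOT IN return no rows at all, while NOT EXISTS still returns the rows you would expect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rates (id INTEGER, name TEXT)")  # id is nullable here
conn.execute("CREATE TABLE records (rec_id INTEGER PRIMARY KEY, rate_id INTEGER)")
conn.executemany("INSERT INTO rates VALUES (?, ?)", [(1, "No Usage"), (None, "No Usage")])
conn.executemany("INSERT INTO records (rate_id) VALUES (?)", [(1,), (2,), (3,)])

# The subquery yields {1, NULL}; "rate_id NOT IN (1, NULL)" is never true.
not_in = conn.execute(
    "SELECT rec_id FROM records "
    "WHERE rate_id NOT IN (SELECT id FROM rates WHERE name = 'No Usage') "
    "ORDER BY rec_id"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists = conn.execute(
    "SELECT rec_id FROM records AS r "
    "WHERE NOT EXISTS (SELECT 1 FROM rates AS p "
    "                  WHERE p.id = r.rate_id AND p.name = 'No Usage') "
    "ORDER BY rec_id"
).fetchall()

print(not_in, not_exists)
```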

Why would you choose IN over EXISTS when it has been proven time and time again that EXISTS is more efficient (or at least no less efficient), since it can short-circuit? IN has to materialize the entire set.

SELECT * -- stop doing this
FROM dbo.usagerecords AS ur
WHERE EXISTS
(
  SELECT 1 FROM dbo.pipelinerate AS pr
  WHERE pr.id = ur.usagerateid
  AND pr.name <> 'No Usage'
);

You can also express your other query like this:

SELECT * -- again, stop doing this
FROM dbo.usagerecords AS ur
WHERE NOT EXISTS
(
  SELECT 1 FROM dbo.pipelinerate AS pr
  WHERE pr.id = ur.usagerateid
  AND pr.name = 'No Usage'
);

But I have no idea which, if either, gets the correct results. This is why we typically ask for sample data and desired results.

Your use of SELECT * is likely to have a greater negative impact on performance than whether you use IN or EXISTS. FWIW.

Equals(=) vs. LIKE

Different Operators

LIKE and = are different operators. Most answers here focus on the wildcard support, which is not the only difference between these operators!

= is a comparison operator that operates on numbers and strings. When comparing strings, the comparison operator compares whole strings.

LIKE is a string operator that compares character by character.

To complicate matters, both operators use a collation which can have important effects on the result of the comparison.

Motivating Example

Let us first identify an example where these operators produce obviously different results. Allow me to quote from the MySQL manual:

Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:

mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+
| 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
+-----------------------------------------+
| 0 |
+-----------------------------------------+
mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+
| 'ä' = 'ae' COLLATE latin1_german2_ci |
+--------------------------------------+
| 1 |
+--------------------------------------+

Please note that this page of the MySQL manual is called String Comparison Functions, and = is not discussed, which implies that = is not strictly a string comparison function.

How Does = Work?

The SQL Standard § 8.2 describes how = compares strings:

The comparison of two character strings is determined as follows:

a) If the length in characters of X is not equal to the length
in characters of Y, then the shorter string is effectively
replaced, for the purposes of comparison, with a copy of
itself that has been extended to the length of the longer
string by concatenation on the right of one or more pad
characters, where the pad character is chosen based on CS. If
CS has the NO PAD attribute, then the pad character is an
implementation-dependent character different from any
character in the character set of X and Y that collates less
than any string under CS. Otherwise, the pad character is a
<space>.

b) The result of the comparison of X and Y is given by the
collating sequence CS.

c) Depending on the collating sequence, two strings may
compare as equal even if they are of different lengths or
contain different sequences of characters. When the operations
MAX, MIN, DISTINCT, references to a grouping column, and the
UNION, EXCEPT, and INTERSECT operators refer to character
strings, the specific value selected by these operations from
a set of such equal values is implementation-dependent.

(Emphasis added.)
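Rule (a) can be modelled in a few lines of Python. This is only a sketch of the padding step for a collation without the NO PAD attribute, and it assumes the collating sequence in rule (b) is a plain code-point comparison. Real collations may also fold case, accents, and so on. The function name pad_space_equal is made up for illustration.

```python
def pad_space_equal(x: str, y: str) -> bool:
    """Model of '=' on strings under a PAD SPACE collation.

    Assumption: after padding, the collating sequence is a plain
    code-point comparison (a stand-in for a real collation).
    """
    width = max(len(x), len(y))
    # Rule (a): extend the shorter string on the right with spaces.
    padded_x = x.ljust(width, " ")
    padded_y = y.ljust(width, " ")
    # Rule (b): hand both strings to the collating sequence.
    return padded_x == padded_y

print(pad_space_equal("abc", "abc   "))   # trailing spaces are ignored
print(pad_space_equal(" abc", "abc"))     # leading spaces are not
```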

What does this mean? It means that when comparing strings, the = operator is just a thin wrapper around the current collation. A collation is a library that has various rules for comparing strings. Here is an example of a binary collation from MySQL:

static int my_strnncoll_binary(const CHARSET_INFO *cs __attribute__((unused)),
const uchar *s, size_t slen,
const uchar *t, size_t tlen,
my_bool t_is_prefix)
{
size_t len= MY_MIN(slen,tlen);
int cmp= memcmp(s,t,len);
return cmp ? cmp : (int)((t_is_prefix ? len : slen) - tlen);
}

This particular collation happens to compare byte-by-byte (which is why it's called "binary" — it doesn't give any special meaning to strings). Other collations may provide more advanced comparisons.

For example, MySQL also ships a UTF-8 collation that supports case-insensitive comparisons. The code is too long to paste here, but the body of my_strnncollsp_utf8mb4() in the MySQL source shows how it works. This collation can process multiple bytes at a time and it can apply various transforms (such as case-insensitive comparison). The = operator is completely abstracted from the vagaries of the collation.

How Does LIKE Work?

The SQL Standard § 8.5 describes how LIKE compares strings:

The <predicate>

M LIKE P

is true if there exists a partitioning of M into substrings
such that:

i) A substring of M is a sequence of 0 or more contiguous
<character representation>s of M and each <character
representation> of M is part of exactly one substring.

ii) If the i-th substring specifier of P is an arbitrary
character specifier, the i-th substring of M is any single
<character representation>.

iii) If the i-th substring specifier of P is an arbitrary string
specifier, then the i-th substring of M is any sequence of
0 or more <character representation>s.

iv) If the i-th substring specifier of P is neither an
arbitrary character specifier nor an arbitrary string specifier,
then the i-th substring of M is equal to that substring
specifier according to the collating sequence of
the <like predicate>, without the appending of <space>
characters to M, and has the same length as that substring
specifier.

v) The number of substrings of M is equal to the number of
substring specifiers of P.

(Emphasis added.)

This is pretty wordy, so let's break it down. Items ii and iii refer to the wildcards _ and %, respectively. If P does not contain any wildcards, then only item iv applies. This is the case of interest posed by the OP.

In this case, it compares each "substring" (individual characters) in M against each substring in P using the current collation.
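The partitioning rules can be approximated in Python by translating the pattern into a regular expression. This is a rough model, not how any database implements LIKE. It assumes no ESCAPE clause and compares literal characters by code point instead of through a real collation, and the function name sql_like is hypothetical. It does show the key point of item iv: no padding is applied, so a pattern without wildcards must match the whole string, trailing spaces included.

```python
import re

def sql_like(m: str, p: str) -> bool:
    """Rough model of M LIKE P.

    '_' matches exactly one character, '%' matches any run of
    characters (including none), and every other character must match
    itself. Assumptions: no ESCAPE clause, code-point comparison.
    """
    parts = []
    for ch in p:
        if ch == "_":
            parts.append(".")            # arbitrary character specifier
        elif ch == "%":
            parts.append(".*")           # arbitrary string specifier
        else:
            parts.append(re.escape(ch))  # literal substring specifier
    # fullmatch: the substrings must cover all of M, with nothing left over.
    return re.fullmatch("".join(parts), m, flags=re.DOTALL) is not None

print(sql_like("abc", "abc"))    # no wildcards, identical strings
print(sql_like("abc ", "abc"))   # trailing space is not padded away
print(sql_like("abc", "a_c"))
print(sql_like("abc", "a%"))
```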

Conclusions

The bottom line is that when comparing strings, = compares the entire string while LIKE compares one character at a time. Both comparisons use the current collation. This difference leads to different results in some cases, as evidenced in the first example in this post.

Which one should you use? Nobody can tell you that — you need to use the one that's correct for your use case. Don't prematurely optimize by switching comparison operators.


