Equals(=) Vs. Like

Equals(=) vs. LIKE

Different Operators

LIKE and = are different operators. Most answers here focus on the wildcard support, which is not the only difference between these operators!

= is a comparison operator that operates on numbers and strings. When comparing strings, the comparison operator compares whole strings.

LIKE is a string operator that compares character by character.

To complicate matters, both operators use a collation which can have important effects on the result of the comparison.

Motivating Example

Let us first identify an example where these operators produce obviously different results. Allow me to quote from the MySQL manual:

Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:

mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+
| 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
+-----------------------------------------+
| 0 |
+-----------------------------------------+
mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+
| 'ä' = 'ae' COLLATE latin1_german2_ci |
+--------------------------------------+
| 1 |
+--------------------------------------+

Please note that this page of the MySQL manual is called String Comparison Functions, and = is not discussed, which implies that = is not strictly a string comparison function.

How Does = Work?

The SQL Standard § 8.2 describes how = compares strings:

The comparison of two character strings is determined as follows:

a) If the length in characters of X is not equal to the length
in characters of Y, then the shorter string is effectively
replaced, for the purposes of comparison, with a copy of
itself that has been extended to the length of the longer
string by concatenation on the right of one or more pad
characters, where the pad character is chosen based on CS. If
CS has the NO PAD attribute, then the pad character is an
implementation-dependent character different from any
character in the character set of X and Y that collates less
than any string under CS. Otherwise, the pad character is a
<space>.

b) The result of the comparison of X and Y is given by the
collating sequence CS.

c) Depending on the collating sequence, two strings may
compare as equal even if they are of different lengths or
contain different sequences of characters. When the operations
MAX, MIN, DISTINCT, references to a grouping column, and the
UNION, EXCEPT, and INTERSECT operators refer to character
strings, the specific value selected by these operations from
a set of such equal values is implementation-dependent.

(Emphasis added.)

What does this mean? It means that when comparing strings, the = operator is just a thin wrapper around the current collation. A collation is a library that has various rules for comparing strings. Here is an example of a binary collation from MySQL:

static int my_strnncoll_binary(const CHARSET_INFO *cs __attribute__((unused)),
const uchar *s, size_t slen,
const uchar *t, size_t tlen,
my_bool t_is_prefix)
{
size_t len= MY_MIN(slen,tlen);
int cmp= memcmp(s,t,len);
return cmp ? cmp : (int)((t_is_prefix ? len : slen) - tlen);
}

This particular collation happens to compare byte-by-byte (which is why it's called "binary" — it doesn't give any special meaning to strings). Other collations may provide more advanced comparisons.

For example, here is a UTF-8 collation that supports case-insensitive comparisons. The code is too long to paste here, but go to that link and read the body of my_strnncollsp_utf8mb4(). This collation can process multiple bytes at a time and it can apply various transforms (such as case insensitive comparison). The = operator is completely abstracted from the vagaries of the collation.

How Does LIKE Work?

The SQL Standard § 8.5 describes how LIKE compares strings:

The <predicate>

M LIKE P

is true if there exists a partitioning of M into substrings
such that:

i) A substring of M is a sequence of 0 or more contiguous
<character representation>s of M and each <character
representation> of M is part of exactly one substring.

ii) If the i-th substring specifier of P is an arbitrary
character specifier, the i-th substring of M is any single
<character representation>.

iii) If the i-th substring specifier of P is an arbitrary string
specifier, then the i-th substring of M is any sequence of
0 or more <character representation>s.

iv) If the i-th substring specifier of P is neither an
arbitrary character specifier nor an arbitrary string specifier,
then the i-th substring of M is equal to that substring
specifier according to the collating sequence of
the <like predicate>, without the appending of <space>
characters to M, and has the same length as that substring
specifier.

v) The number of substrings of M is equal to the number of
substring specifiers of P.

(Emphasis added.)

This is pretty wordy, so let's break it down. Items ii and iii refer to the wildcards _ and %, respectively. If P does not contain any wildcards, then only item iv applies. This is the case of interest posed by the OP.

In this case, it compares each "substring" (individual characters) in M against each substring in P using the current collation.

Conclusions

The bottom line is that when comparing strings, = compares the entire string while LIKE compares one character at a time. Both comparisons use the current collation. This difference leads to different results in some cases, as evidenced in the first example in this post.

Which one should you use? Nobody can tell you that — you need to use the one that's correct for your use case. Don't prematurely optimize by switching comparison operators.

What's the difference between LIKE and = in SQL?

As per SQL standard, the difference is treatment of trailing whitespace in CHAR columns. Example:

create table t1 ( c10 char(10) );
insert into t1 values ('davyjones');

select * from t1 where c10 = 'davyjones';
-- yields 1 row

select * from t1 where c10 like 'davyjones';
-- yields 0 rows

Of course, assuming you run this on a standard-compliant DBMS. BTW, this is one the main differences between CHARs and VARCHARs.

Equals(=) vs. LIKE for date data type

Assuming LAST_TRANSACTION_DATE is a DATE column (or TIMESTAMP) then both version are very bad practice.

In both cases the DATE column will implicitly be converted to a character literal based on the current NLS settings. That means with different clients you will get different results.

When using date literals always use to_date() with(!) a format mask or use an ANSI date literal. That way you compare dates with dates not strings with strings. So for the equal comparison you should use:

LAST_TRANSACTION_DATE = to_date('30-JUL-07', 'dd-mon-yy')

Note that using 'MON' can still lead to errors with different NLS settings ('DEC' vs. 'DEZ' or 'MAR' vs. 'MRZ'). It is much less error prone using month numbers (and four digit years):

LAST_TRANSACTION_DATE = to_date('30-07-2007', 'dd-mm-yyyy')

or using an ANSI date literal

LAST_TRANSACTION_DATE = DATE '2007-07-30'

Now the reason why the above query is very likely to return nothing is that in Oracle DATE columns include the time as well. The above date literals implicitly contain the time 00:00. If the time in the table is different (e.g. 19:54) then of course the dates are not equal.

To workaround this problem you have different options:

  1. use trunc() on the table column to "normalize" the time to 00:00
    trunc(LAST_TRANSACTION_DATE) = DATE '2007-07-30
    this will however prevent the usage of an index defined on LAST_TRANSACTION_DATE
  2. use between

    LAST_TRANSACTION_DATE between to_date('2007-07-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and to_date('2007-07-30 23:59:59', 'yyyy-mm-dd hh24:mi:ss')

The performance problem of the first solution could be worked around by creating an index on trunc(LAST_TRANSACTION_DATE) which could be used by that expression. But the expression LAST_TRANSACTION_DATE = '30-JUL-07' prevents an index usage as well because internally it's processed as to_char(LAST_TRANSACTION_DATE) = '30-JUL-07'

The important things to remember:

  1. Never, ever rely on implicit data type conversion. It will give you problems at some point. Always compare the correct data types
  2. Oracle DATE columns always contain a time which is part of the comparison rules.

Query Form Logic: Like vs. Equals

This is something I would leave up to the user, allowing then to actually make a choice. All the UIs I've seen for allowing user-specified conditions have:

  • the column to check.
  • a drop-down box containing the relationship, such as equal to, not equal to, less than, greater than, starts with.
  • the value you want to compare to.

Then, for the starts with option, you just tack on % and use like.

You'll note (for performance reasons which you seem to already understand) I used starts with rather than like to limit the possibility of dragging down the database performance.

I'm not a big fan of unrestricted like statements although you could also provide ends with for those DBMS' capable of storing reversed indexes.

SQL 'like' vs '=' performance

See https://web.archive.org/web/20150209022016/http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx

Quote from there:

the rules for index usage with LIKE
are loosely like this:

  • If your filter criteria uses equals =
    and the field is indexed, then most
    likely it will use an INDEX/CLUSTERED
    INDEX SEEK

  • If your filter criteria uses LIKE,
    with no wildcards (like if you had a
    parameter in a web report that COULD
    have a % but you instead use the full
    string), it is about as likely as #1
    to use the index. The increased cost
    is almost nothing.

  • If your filter criteria uses LIKE, but
    with a wildcard at the beginning (as
    in Name0 LIKE '%UTER') it's much less
    likely to use the index, but it still
    may at least perform an INDEX SCAN on
    a full or partial range of the index.

  • HOWEVER, if your filter criteria uses
    LIKE, but starts with a STRING FIRST
    and has wildcards somewhere AFTER that
    (as in Name0 LIKE 'COMP%ER'), then SQL
    may just use an INDEX SEEK to quickly
    find rows that have the same first
    starting characters, and then look
    through those rows for an exact match.


(Also keep in mind, the SQL engine
still might not use an index the way
you're expecting, depending on what
else is going on in your query and
what tables you're joining to. The
SQL engine reserves the right to
rewrite your query a little to get the
data in a way that it thinks is most
efficient and that may include an
INDEX SCAN instead of an INDEX SEEK)

Difference between SQL LIKE without percent signs and equal (=) in WHERE clause

In practice, LIKE with no wildcards is functionally equivalent to =. However, they are not the same! The obvious difference is that = doesn't treat \, %, and _ in any special way, but LIKE does.

The documentation is pretty clear on this:

Per the SQL standard, LIKE performs matching on a per-character basis,
thus it can produce results different from the = comparison operator:

In addition to collation differences, trailing spaces matter:

In particular, trailing spaces are significant, which is not true for
CHAR or VARCHAR comparisons performed with the = operator:

In practice, the strings being compared usually have the same collation, don't have trailing spaces, and special characters are ignored, so LIKE is sometimes used as a replacement for = (especially because LIKE without wildcards at the beginning of the pattern can also make use of an index).

What is the difference between == and equals() in Java?

In general, the answer to your question is "yes", but...

  • .equals(...) will only compare what it is written to compare, no more, no less.
  • If a class does not override the equals method, then it defaults to the equals(Object o) method of the closest parent class that has overridden this method.
  • If no parent classes have provided an override, then it defaults to the method from the ultimate parent class, Object, and so you're left with the Object#equals(Object o) method. Per the Object API this is the same as ==; that is, it returns true if and only if both variables refer to the same object, if their references are one and the same. Thus you will be testing for object equality and not functional equality.
  • Always remember to override hashCode if you override equals so as not to "break the contract". As per the API, the result returned from the hashCode() method for two objects must be the same if their equals methods show that they are equivalent. The converse is not necessarily true.

Can I combine like and equal to get data?

Since your datatype is CHAR, Gordon's answer is not working for you. CHAR adds trailing spaces for the strings less than maximum limit. You could use TRIM to fix this as shown. But, you should preferably store numbers in the NUMBER type and not CHAR or VARCHAR2, which will create other problems sooner or later.

select *
from data
where trim(ID) like '12345_____';

Which equals operator (== vs ===) should be used in JavaScript comparisons?

The strict equality operator (===) behaves identically to the abstract equality operator (==) except no type conversion is done, and the types must be the same to be considered equal.

Reference: Javascript Tutorial: Comparison Operators

The == operator will compare for equality after doing any necessary type conversions. The === operator will not do the conversion, so if two values are not the same type === will simply return false. Both are equally quick.

To quote Douglas Crockford's excellent JavaScript: The Good Parts,

JavaScript has two sets of equality operators: === and !==, and their evil twins == and !=. The good ones work the way you would expect. If the two operands are of the same type and have the same value, then === produces true and !== produces false. The evil twins do the right thing when the operands are of the same type, but if they are of different types, they attempt to coerce the values. the rules by which they do that are complicated and unmemorable. These are some of the interesting cases:

'' == '0'           // false
0 == '' // true
0 == '0' // true

false == 'false' // false
false == '0' // true

false == undefined // false
false == null // false
null == undefined // true

' \t\r\n ' == 0 // true

Equality Comparison Table

The lack of transitivity is alarming. My advice is to never use the evil twins. Instead, always use === and !==. All of the comparisons just shown produce false with the === operator.


Update:

A good point was brought up by @Casebash in the comments and in @Phillipe Laybaert's answer concerning objects. For objects, == and === act consistently with one another (except in a special case).

var a = [1,2,3];
var b = [1,2,3];

var c = { x: 1, y: 2 };
var d = { x: 1, y: 2 };

var e = "text";
var f = "te" + "xt";

a == b // false
a === b // false

c == d // false
c === d // false

e == f // true
e === f // true

The special case is when you compare a primitive with an object that evaluates to the same primitive, due to its toString or valueOf method. For example, consider the comparison of a string primitive with a string object created using the String constructor.

"abc" == new String("abc")    // true
"abc" === new String("abc") // false

Here the == operator is checking the values of the two objects and returning true, but the === is seeing that they're not the same type and returning false. Which one is correct? That really depends on what you're trying to compare. My advice is to bypass the question entirely and just don't use the String constructor to create string objects from string literals.

Reference

http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.3

Performance differences between equal (=) and IN with one literal value

There is no difference between those two statements, and the optimiser will transform the IN to the = when IN has just one element in it.

Though when you have a question like this, just run both statements, run their execution plan and see the differences. Here - you won't find any.

After a big search online, I found a document on SQL to support this (I assume it applies to all DBMS):

If there is only one value inside the parenthesis, this commend [sic] is equivalent to,

WHERE "column_name" = 'value1

Here is the execution plan of both queries in Oracle (most DBMS will process this the same):

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number = '123456789'

Plan hash value: 2312174735
-----------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
| 2 | INDEX UNIQUE SCAN | SYS_C0029838 |
-----------------------------------------------------

And for IN() :

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number in('123456789');

Plan hash value: 2312174735
-----------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
| 2 | INDEX UNIQUE SCAN | SYS_C0029838 |
-----------------------------------------------------

As you can see, both are identical. This is on an indexed column. Same goes for an unindexed column (just full table scan).



Related Topics



Leave a reply



Submit