SQL: BETWEEN vs <= and >=

SQL: BETWEEN vs <= and >=

They are identical: BETWEEN is a shorthand for the longer syntax in the question that includes both values (EventDate >= '10/15/2009' and EventDate <= '10/19/2009').

Use the longer syntax where BETWEEN doesn't work because one or both of the boundary values should be excluded, e.g.:

Select EventId,EventName from EventMaster
where EventDate >= '10/15/2009' and EventDate < '10/19/2009'

(Note < rather than <= in second condition.)
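The difference between the inclusive and half-open forms can be checked on a toy table. The sketch below is an illustration under stated assumptions, not the original poster's setup: it uses Python's sqlite3 module with an in-memory database as a stand-in engine, stores dates as ISO-8601 text (so string order matches date order), and fills the EventMaster table from the query above with made-up rows.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; dates are ISO-8601 text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EventMaster (EventId INTEGER, EventName TEXT, EventDate TEXT)")
conn.executemany(
    "INSERT INTO EventMaster VALUES (?, ?, ?)",
    [
        (1, "before", "2009-10-14"),
        (2, "first day", "2009-10-15"),
        (3, "middle", "2009-10-17"),
        (4, "last day", "2009-10-19"),
        (5, "after", "2009-10-20"),
    ],
)

def event_ids(where_clause):
    # Hypothetical helper: return the matching EventIds in ascending order.
    sql = "SELECT EventId FROM EventMaster WHERE " + where_clause + " ORDER BY EventId"
    return [row[0] for row in conn.execute(sql)]

between_ids = event_ids("EventDate BETWEEN '2009-10-15' AND '2009-10-19'")
inclusive_ids = event_ids("EventDate >= '2009-10-15' AND EventDate <= '2009-10-19'")
half_open_ids = event_ids("EventDate >= '2009-10-15' AND EventDate < '2009-10-19'")

print(between_ids)    # [2, 3, 4]
print(inclusive_ids)  # [2, 3, 4]
print(half_open_ids)  # [2, 3]
```

BETWEEN and the >= / <= pair return the same rows, while the half-open form drops the row that falls exactly on the upper bound.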

BETWEEN clause versus <= AND >=

There is no performance difference between the two example queries because BETWEEN is simply a shorthand way of expressing an inclusive range comparison. When Oracle parses the BETWEEN condition, it automatically expands it into separate comparison clauses:

ex.

SELECT *  
FROM table
WHERE column BETWEEN :lower_bound AND :upper_bound

...will automatically become:

SELECT *  
FROM table
WHERE :lower_bound <= column
AND :upper_bound >= column

BETWEEN operator vs. >= AND <=: Is there a performance difference?

No performance benefit; it is just syntactic sugar.

By using the BETWEEN version, you can avoid function reevaluation in some cases.

Compare performance difference of T-SQL BETWEEN and '<' '>' operator?

You can check this easily enough by comparing the query plans in both situations. There is no performance difference of which I am aware. There is a logical difference, though, between BETWEEN and "<" and ">": BETWEEN is inclusive. It's equivalent to ">=" and "<=".
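As a rough illustration of comparing plans, the sketch below asks an engine for the plan of each form and prints both. It is not T-SQL: it assumes SQLite (via Python's sqlite3 module) as a stand-in, uses its EXPLAIN QUERY PLAN statement, and the table and index names (table1, idx_table1_x) are made up. On SQL Server you would inspect the execution plans with its own tooling instead.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (x INTEGER)")
conn.execute("CREATE INDEX idx_table1_x ON table1 (x)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1000)])

def plan_details(sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

between_plan = plan_details("SELECT COUNT(*) FROM table1 WHERE x BETWEEN 1 AND 6")
explicit_plan = plan_details("SELECT COUNT(*) FROM table1 WHERE x >= 1 AND x <= 6")

# Print both plans side by side; the exact wording varies by SQLite version.
print(between_plan)
print(explicit_plan)
```

If the two printed plans describe the same index access, the engine is treating both spellings alike, which is what the answer above expects.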

Difference in SQL BETWEEN operator and >= & <= operator

Modern databases ship with very intelligent query execution optimisers. One of their main features is query transformation. Logically equivalent expressions can usually be transformed into each other. e.g. as Anthony suggested, the BETWEEN operator can be rewritten by Oracle (and MySQL) as two AND-connected comparisons, and vice versa, if BETWEEN isn't just implemented as syntactic sugar.

So even if there were a difference in performance, you can be assured that Oracle will very likely choose the better option.

This means that you can freely choose your preference, e.g. because of readability.

Note: it is not always obvious what is logically equivalent. Query transformation rules become more complex when it comes to transforming EXISTS, IN, NOT EXISTS, NOT IN... But in this case, the two forms are equivalent. For more details, read the specification (chapter 8.3, between predicate):

http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?

BETWEEN should outperform IN in this case (but do measure and check execution plans, too!), especially as n grows and as long as statistics are still accurate. Let's assume:

  • m is the size of your table
  • n is the size of your range

Index can be used (n is tiny compared to m)

  • In theory, BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, and then traverse at most n index leaf nodes. The complexity will be O(n + log m)

  • IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will be O(n * log m), which is always worse (though negligible for very small tables m or very small ranges n)

Index cannot be used (n is a significant portion of m)

In any case, you'll get a full table scan and evaluate the predicate on each row:

  • BETWEEN needs to evaluate two predicates: One for the lower and one for the upper bound. The complexity is O(m)

  • IN needs to evaluate at most n predicates. The complexity is O(m * n) ... which is again always worse, or perhaps O(m) if the database can optimise the IN list to be a hashmap, rather than a list of predicates.

SQL why use 'between' instead of ' = and ='

Sometimes using BETWEEN can save you from evaluating an operation more than once:

SELECT  AVG(RAND(20091225) BETWEEN 0.2 AND 0.4)
FROM t_source;

---
0.1998

SELECT AVG(RAND(20091225) >= 0.2 AND RAND(20091225) <= 0.4)
FROM t_source;

---
0.3199

In the first query RAND() is called only once per row, but in the second query it is called twice. Here BETWEEN saves you a second function call to RAND().
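The single-evaluation effect can be made visible by counting calls. The sketch below is an assumption-heavy stand-in for the MySQL example above: it uses SQLite through Python's sqlite3 module, and counted_value is a hypothetical user-defined function registered only for this demonstration. Other engines may evaluate the expressions differently.

```python
import sqlite3

calls = {"count": 0}

def counted_value():
    # Hypothetical helper: counts its own invocations and returns 0.3,
    # which lies inside [0.2, 0.4], so neither comparison can be skipped.
    calls["count"] += 1
    return 0.3

# Assumption: SQLite stands in for MySQL; the function is non-deterministic
# by default, so the engine may not fold repeated calls into one.
conn = sqlite3.connect(":memory:")
conn.create_function("counted_value", 0, counted_value)

def calls_for(sql):
    # Reset the counter, run the statement, and report how often the
    # function was invoked while producing the result.
    calls["count"] = 0
    conn.execute(sql).fetchall()
    return calls["count"]

between_calls = calls_for("SELECT counted_value() BETWEEN 0.2 AND 0.4")
explicit_calls = calls_for(
    "SELECT counted_value() >= 0.2 AND counted_value() <= 0.4"
)

print("BETWEEN:", between_calls, "call(s)")
print("explicit:", explicit_calls, "call(s)")
```

With a function that has side effects or random output, the two spellings are therefore not interchangeable, which also explains the different averages in the RAND() example.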

There is also something to be said for readability in SQL queries. Queries can become massive, and constructs like BETWEEN help keep them readable.

SQL: BETWEEN and IN (which is faster)

  • If your ids are always consecutive you should use BETWEEN.
  • If your ids may or may not be consecutive then use IN.

Performance shouldn't really be the deciding factor here. Having said that, BETWEEN seems to be faster in all examples that I have tested. For example:

Without indexes, checking a table with a million rows where every row has x = 1:


SELECT COUNT(*) FROM table1 WHERE x IN (1, 2, 3, 4, 5, 6);
Time taken: 0.55s

SELECT COUNT(*) FROM table1 WHERE x BETWEEN 1 AND 6;
Time taken: 0.54s

Without indexes, checking a table with a million rows where x has unique values:


SELECT COUNT(*) FROM table1 WHERE x IN (1, 2, 3, 4, 5, 6);
Time taken: 0.65s

SELECT COUNT(*) FROM table1 WHERE x BETWEEN 1 AND 6;
Time taken: 0.36s

A more realistic example though is that the id column is unique and indexed. When you do this the performance of both queries becomes close to instant.


SELECT COUNT(*) FROM table2 WHERE x IN (1, 2, 3, 4, 5, 6);
Time taken: 0.00s

SELECT COUNT(*) FROM table2 WHERE x BETWEEN 1 AND 6;
Time taken: 0.00s

So I'd say concentrate on writing a clear SQL statement rather than worrying about minor differences in execution speed. And make sure that the table is correctly indexed because that will make the biggest difference.

Note: Tests were performed on SQL Server Express 2008 R2. Results may be different on other systems.
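Since results depend on the system, it is worth repeating the measurement yourself. The following is a minimal sketch of such a check, under loud assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server, uses a smaller unindexed table of 200,000 unique values, and times each query only once, so the numbers are indicative at best.

```python
import sqlite3
import time

# Assumption: SQLite stands in for SQL Server; table1 has unique x values
# from 1 to 200000 and no index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (x INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?)", ((i,) for i in range(1, 200001)))

def timed_count(sql):
    # Hypothetical helper: run a COUNT(*) query once and return
    # (row count, elapsed seconds).
    start = time.perf_counter()
    (count,) = conn.execute(sql).fetchone()
    return count, time.perf_counter() - start

in_count, in_seconds = timed_count(
    "SELECT COUNT(*) FROM table1 WHERE x IN (1, 2, 3, 4, 5, 6)"
)
between_count, between_seconds = timed_count(
    "SELECT COUNT(*) FROM table1 WHERE x BETWEEN 1 AND 6"
)

print(f"IN:      {in_count} rows in {in_seconds:.4f}s")
print(f"BETWEEN: {between_count} rows in {between_seconds:.4f}s")
```

Both queries should report the same row count here because the values are consecutive integers; only the timings are expected to vary from run to run.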

SQL between not inclusive

It is inclusive. You are comparing datetimes to dates. The second date is interpreted as midnight when the day starts.

One way to fix this is:

SELECT *
FROM Cases
WHERE cast(created_at as date) BETWEEN '2013-05-01' AND '2013-05-01'

Another way to fix it is with explicit binary comparisons:

SELECT *
FROM Cases
WHERE created_at >= '2013-05-01' AND created_at < '2013-05-02'
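The effect of the midnight boundary can be reproduced on a small table. This sketch makes several assumptions: it uses SQLite through Python's sqlite3 module instead of the original database, stores created_at as ISO-8601 text, writes the date literal out as the midnight timestamp it is interpreted as, and uses SQLite's date() function in place of the cast. The rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for the original engine; timestamps are text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cases (id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO Cases VALUES (?, ?)",
    [
        (1, "2013-05-01 00:00:00"),
        (2, "2013-05-01 14:30:00"),
        (3, "2013-05-02 00:00:00"),
    ],
)

def case_ids(where_clause):
    # Hypothetical helper: return the matching ids in ascending order.
    sql = "SELECT id FROM Cases WHERE " + where_clause + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

# Both bounds are midnight on 2013-05-01, so only the midnight row matches.
naive_ids = case_ids(
    "created_at BETWEEN '2013-05-01 00:00:00' AND '2013-05-01 00:00:00'"
)
# Truncating to a date first compares whole days.
date_ids = case_ids("date(created_at) BETWEEN '2013-05-01' AND '2013-05-01'")
# The half-open range covers the whole day without touching the column.
half_open_ids = case_ids("created_at >= '2013-05-01' AND created_at < '2013-05-02'")

print(naive_ids)      # [1]
print(date_ids)       # [1, 2]
print(half_open_ids)  # [1, 2]
```

The half-open form leaves the column untouched in the predicate, whereas the date() version applies a function to every row before comparing.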

Aaron Bertrand has a long blog entry on dates, where he discusses this and other date issues.


