SQL Server UNION - What Is the Default ORDER BY Behaviour

There is no default order.

Without an ORDER BY clause, the order of the returned rows is undefined. That means SQL Server can return the rows in any order it likes.

EDIT:
Based on what I have seen, without an Order By, the order that the results come back in depends on the query plan. So if there is an index that it is using, the result may come back in that order but again there is no guarantee.
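As a minimal sketch of the fix, the query below puts a single ORDER BY at the end of the UNION, where it applies to the combined result. The tables dbo.Customers and dbo.Suppliers and the column Name are hypothetical and stand in for whatever is being unioned.

```sql
-- dbo.Customers and dbo.Suppliers are hypothetical tables used only for illustration.
SELECT Name FROM dbo.Customers
UNION
SELECT Name FROM dbo.Suppliers
ORDER BY Name;  -- sorts the combined result, not just the second SELECT
```

Without that last line, the rows may happen to come back sorted (UNION often uses a sort to remove duplicates), but that is a side effect of the plan and not something to rely on.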

Does UNION ALL guarantee the order of the result set

There is no inherent order; you have to use ORDER BY. For your example you can easily do this by adding a SortOrder column to each SELECT. This will then keep the records in the order you want:

SELECT 'O', 1 SortOrder
UNION ALL
SELECT 'R', 2
UNION ALL
SELECT 'D', 3
UNION ALL
SELECT 'E', 4
UNION ALL
SELECT 'R', 5
ORDER BY SortOrder

You cannot guarantee the order unless you specifically provide an order by with the query.

Strange Order By behaviour with Union

You need parentheses in the first query, because the order by is applied to the result of the union. It is interpreted as:

(select category,timefield from test where category='A'
union
select top 10 category,timefield from test where category='B'
)
order by timefield desc

(I'm not saying this is valid syntax.)

whereas what you want is:

(select category, timefield from test where category='A')
union
(select top 10 category, timefield from test where category='B' order by timefield desc)

Do note that union will remove duplicates. If you don't want this additional overhead, use union all instead.

As to why this works as a subquery, my guess is that it is a coincidence. SQL Server is not going to guarantee the results being returned when you use top without an order by -- or even that the results are consistent from one call to the next. Sometimes it might actually do what you want.
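In case SQL Server rejects an ORDER BY inside a parenthesized UNION branch as a syntax error, a derived table is a common alternative. The following is only a sketch against the same test table; the alias b is arbitrary and the final ORDER BY is an assumption about how the output should be presented.

```sql
-- Sketch only: same test table as above; the alias b is arbitrary.
SELECT category, timefield
FROM test
WHERE category = 'A'
UNION
SELECT b.category, b.timefield
FROM (
    SELECT TOP 10 category, timefield
    FROM test
    WHERE category = 'B'
    ORDER BY timefield DESC   -- decides WHICH 10 rows TOP keeps
) AS b
ORDER BY timefield DESC;      -- decides the order of the final result
```

The inner ORDER BY only controls which rows TOP selects. The order in which rows are returned is determined solely by the outer ORDER BY.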

interesting behaviour of order by with union all clause

The statement

select * from (select * from dual order by 1)

has no defined order at all. Only the outermost ORDER BY takes effect in SQL (except if there is a row limit set).

If you still happen to observe order in the query results this is a coincidence that can go away at any time.

In the statement

select * from dual
union all
select * from dual order by 1

The order by is attached to the union all, not to the 2nd select. It is therefore top-level and well-defined.

Use the last form, and put the order by on a new line to make this easier to read.


How can I then sort just single select with union all?

The output order of union all is undefined without an order by clause. In particular, the two inputs are not guaranteed to be returned one after the other. To get that effect, tag each input and sort by the tag:

select d.*, 1 as Tag from dual d
union all
select d.*, 2 as Tag from dual d
order by Tag, 1 --simulate ordered concatenation of inputs

SQL best practice to deal with default sort order

There is no default sort order. Even if the table has a clustered index, you are not guaranteed to get the results in that order. You must use an order by clause if you want a specific order.

Default row order in SELECT query - SQL Server 2008 vs SQL 2012

You need to go back and add ORDER BY clauses to your code, because without them the order is never guaranteed. You were "lucky" in the past that you always got the same order, but it wasn't because SQL Server 2008 guaranteed it in any way. It most likely had to do with your indexes or how the data was being stored on the disk.

If you moved to a new host when you upgraded, the difference in hardware configuration alone could have changed the way your queries execute. On top of that, the new server would have recalculated statistics on the tables, and the SQL Server 2012 query optimizer probably does things a bit differently from the one in SQL Server 2008.

It is a fallacy that you can rely on the order of a result set in SQL without explicitly stating the order you want it in. SQL results NEVER have an order you can rely on without using an ORDER BY clause. SQL is built around set theory. Query results are basically sets (or multi-sets).

Itzik Ben-Gan gives a good description of set theory in relation to SQL in his book Microsoft SQL Server 2012 T-SQL Fundamentals:

Set theory, which originated with the mathematician Georg Cantor, is
one of the mathematical branches on which the relational model is
based. Cantor's definition of a set follows:

By a "set" we mean any collection M into a whole of definite, distinct
objects m (which are called the "elements" of M) of our perception or
of our thought. - Joseph W. Dauben and Georg Cantor (Princeton
University Press, 1990)

After a thorough explanation of the terms in the definition Itzik then goes on to say:

What Cantor's definition of a set leaves out is probably as important
as what it includes. Notice that the definition doesn't mention any
order among the set elements. The order in which set elements are
listed is not important. The formal notation for listing set elements
uses curly brackets: {a, b, c}. Because order has no relevance you can
express the same set as {b, a, c} or {b, c, a}. Jumping ahead to the
set of attributes (called columns in SQL) that make up the header of a
relation (called a table in SQL), an element is supposed to be
identified by name - not ordinal position. Similarly, consider the set
of tuples (called rows by SQL) that make up the body of the relation;
an element is identified by its key values - not by position. Many
programmers have a hard time adapting to the idea that, with respect
to querying tables, there is no order among the rows. In other words,
a query against a table can return rows in any order unless you
explicitly request that the data be sorted in a specific way, perhaps
for presentation purposes.

But regardless of the academic definition of a set, even the implementation in SQL Server has never guaranteed any order in the results. This MSDN blog post from 2005 by a member of the query optimizer team states that you should not rely on the order from intermediate operations at all:

The reordering rules can and will violate this assumption (and do so
when it is inconvenient to you, the developer ;). Please understand
that when we reorder operations to find a more efficient plan, we can
cause the ordering behavior to change for intermediate nodes in the
tree. If you’ve put an operation in the tree that assumes a
particular intermediate ordering, it can break.

This blog post by Conor Cunningham (Architect, SQL Server Core Engine) "No Seatbelt - Expecting Order without ORDER BY" is about SQL Server 2008. He has a table with 20k rows in it with a single index that appears to always return rows in the same order. Adding an ORDER BY to the query doesn't even change the execution plan, so it isn't like adding one in makes the query more expensive if the optimizer realizes it doesn't need it. But once he adds another 20k rows to the table suddenly the query plan changes and now it uses parallelism and the results are no longer ordered!

The hard part here is that there is no reasonable way for any external
user to know when a plan will change. The space of all plans is huge
and hurts your head to ponder. SQL Server's optimizer will change
plans, even for simple queries, if enough of the parameters change.
You may get lucky and not have a plan change, or you can just not
think about this problem and add an ORDER BY.

If you need more convincing just read these posts:

  • Without ORDER BY, there is no default sort order. - Alexander Kuznetsov
  • Order in the court! - Thomas Kyte
  • Order of a Result Set in SQL - Timothy Wiseman

Does table1 UNION ALL table2 guarantee output order table1, table2?

No order by, no order guarantee whatsoever - that's for every database.

And for standard SQL, an ORDER BY is applied to the results from all the unioned queries.

UNION ALL two SELECTs with different column types - expected behaviour?

If you want to use union all, the corresponding columns in every query need to have the same (or an implicitly convertible) type. c3 must be converted to varchar because c1 is varchar. Try the solution below:

create table "tab1" ("c1" varchar(max));
create table "tab2" ("c3" integer);
insert into tab1 values(N'asd'), (N'qweqwe');
insert into tab2 values(123), (345);
select
c_newname as myname
from
(
select "c1" as c_newname from "tab1"
union all
select cast("c3" as varchar(max)) from "tab2"
) as T_UNI;

I replaced "tab3" with "tab1"; I assume it was a typo.
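To see why the explicit cast matters: when the column types differ, SQL Server picks the result type by data type precedence, and integer ranks above varchar. So without the cast it tries to convert the varchar values to integer. A minimal sketch using literals instead of the tables above:

```sql
-- No cast: the result column is typed as int, so 'asd' must be converted to int.
-- This is expected to fail with a conversion error.
SELECT 'asd' AS c_newname
UNION ALL
SELECT 123;

-- With the cast, both branches are varchar and the query succeeds.
SELECT 'asd' AS c_newname
UNION ALL
SELECT CAST(123 AS varchar(max));
```

The first statement only works if every varchar value happens to look like a number, which is why the cast belongs on the integer side.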
