Why Isn't the SQL ANSI-92 Standard Better Adopted Over ANSI-89

Why isn't SQL ANSI-92 standard better adopted over ANSI-89?

According to "SQL Performance Tuning" by Peter Gulutzan and Trudy Pelzer, of the six or eight RDBMS brands they tested, there was no difference in optimization or performance of SQL-89 versus SQL-92 style joins. One can assume that most RDBMS engines transform the syntax into an internal representation before optimizing or executing the query, so the human-readable syntax makes no difference.

I also try to evangelize the SQL-92 syntax. Sixteen years after it was approved, it's about time people start using it! And all brands of SQL database now support it, so there's no reason to continue to use the nonstandard (+) Oracle syntax or *= Microsoft/Sybase syntax.
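For readers who have only seen one of the styles, here is a minimal side-by-side sketch. The tables customers(id, name) and orders(id, customer_id, total) are hypothetical and only serve as illustration.

```sql
-- Hypothetical tables: customers(id, name), orders(id, customer_id, total).

-- SQL-89 style: comma join, with the join condition mixed into WHERE.
SELECT c.name, o.total
FROM customers c, orders o
WHERE c.id = o.customer_id
  AND o.total > 100;

-- SQL-92 style: the same inner join, with the join condition in ON
-- and the filter left in WHERE.
SELECT c.name, o.total
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;

-- Legacy vendor outer joins (shown as comments only, not portable):
--   Oracle:              WHERE c.id = o.customer_id (+)
--   Sybase / SQL Server: WHERE c.id *= o.customer_id
-- SQL-92 replacement that keeps customers with no orders:
SELECT c.name, o.total
FROM customers c
LEFT OUTER JOIN orders o ON o.customer_id = c.id;
```

The first two queries should return the same rows. The third has no SQL-89 equivalent, which is why the vendor-specific operators existed in the first place.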

As for why it's so hard to break the developer community of the SQL-89 habit, I can only assume that there's a large "base of the pyramid" of programmers who code by copy & paste, using ancient examples from books, magazine articles, or another code base, and these people don't learn new syntax abstractly. Some people pattern-match, and some people learn by rote.

I am gradually seeing people using SQL-92 syntax more frequently than I used to, though. I've been answering SQL questions online since 1994.

Is joining to the result of a select (rather than an actual table) valid ANSI standard SQL?

Yes, this is standard ANSI SQL. It's known as a derived table and is a building block of most SQL dialects.

In SQL it's perfectly valid to treat the result of a query as a table in its own right and reference it (with an appropriate alias) in any "parent" query.
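A minimal sketch of joining to a derived table, assuming hypothetical tables customers(id, name) and orders(id, customer_id, total):

```sql
-- The parenthesized SELECT is the derived table; the alias t is required.
-- AS is left out before table aliases because some products (Oracle, for
-- one) reject it there, even though the standard allows it.
SELECT c.name, t.order_count
FROM customers c
INNER JOIN (
    SELECT customer_id, COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
) t ON t.customer_id = c.id;
```

The outer query sees t exactly as it would see a base table with the columns customer_id and order_count.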

It's supported by almost all RDBMS platforms.

I say almost as a caveat but I can't actually think of any currently supported platform that would not allow a derived table.

Why does no database fully support ANSI or ISO SQL standards?

In the software industry you have some standards that are really standards, i.e., products that don't comply with them just don't work. File specifications fall into that category. But then you also have "standards" that are more like guidelines: they may be defined as standards with point-by-point definitions, but they are routinely implemented only partially or with significant differences. Web development is full of such "standards", like HTML, CSS and "ECMAScript", where different vendors (i.e., web browsers) implement the standards differently.

The variation causes headaches, but the standardization still provides benefits. Imagine if there were no HTML standard at all and each browser used its own markup language. Likewise, imagine if there were no SQL standard and each database vendor used its own completely proprietary querying language. There would be much more vendor lock-in, and developers would have a much harder time working with more than one product.

So, no, ANSI SQL doesn't serve the same purpose as ANSI standards do in other industries. But it does serve a useful purpose nonetheless.

Changing SQL Server query to pure ANSI SQL query

You might try to transform the CTE and all the APPLY operators into inline subselects:

declare @rsBuildDetails table(dt datetime, build varchar(255), val varchar(255));

insert into @rsBuildDetails (dt, build, val) values
('20100101', '1', 'pass')
,('20100102', '2', 'fail')
,('20100103', '3', 'pass')
,('20100104', '4', 'fail')
,('20100105', '5', 'fail')
,('20100106', '6', 'fail')
,('20100107', '7', 'pass')
,('20100108', '8', 'pass')
,('20100109', '9', 'pass')
,('20100110', '10', 'fail');

select *
from
(
    select distinct
        (
            select top 1 pre.Dt
            from @rsBuildDetails as pre
            where pre.dt<passed.dt
              and pre.val='fail'
            order by pre.dt desc
        ) as FailedDt
       ,(
            select top 1 post.Dt
            from @rsBuildDetails as post
            where post.dt>passed.dt
              and post.val='fail'
            order by post.dt asc
        ) AS SecondFailedDt
    from
    (
        select *
        from @rsBuildDetails
        where val='pass'
    ) AS passed
) AS tbl
where tbl.FailedDt IS NOT NULL
  AND tbl.SecondFailedDt IS NOT NULL
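Note that the query above still relies on two T-SQL features: the table variable (declare @rsBuildDetails) and top 1. A hedged sketch of a more portable variant follows. It swaps in a technique the answer does not use: since "top 1 ... order by dt desc" picks the latest date, MAX(dt) gives the same value, and MIN(dt) likewise replaces the ascending case. The permanent table rsBuildDetails and the DATE column type are assumptions made for illustration; DATE '...' is the standard literal form, but some products expect a plain string instead.

```sql
-- Hypothetical permanent table standing in for the table variable.
CREATE TABLE rsBuildDetails (dt DATE, build VARCHAR(255), val VARCHAR(255));

INSERT INTO rsBuildDetails (dt, build, val) VALUES
 (DATE '2010-01-01', '1',  'pass')
,(DATE '2010-01-02', '2',  'fail')
,(DATE '2010-01-03', '3',  'pass')
,(DATE '2010-01-04', '4',  'fail')
,(DATE '2010-01-05', '5',  'fail')
,(DATE '2010-01-06', '6',  'fail')
,(DATE '2010-01-07', '7',  'pass')
,(DATE '2010-01-08', '8',  'pass')
,(DATE '2010-01-09', '9',  'pass')
,(DATE '2010-01-10', '10', 'fail');

SELECT tbl.FailedDt, tbl.SecondFailedDt
FROM (
    SELECT DISTINCT
        -- latest failure before the passing build (replaces TOP 1 ... DESC)
        (SELECT MAX(pre.dt)
         FROM rsBuildDetails pre
         WHERE pre.dt < passed.dt
           AND pre.val = 'fail') AS FailedDt,
        -- earliest failure after the passing build (replaces TOP 1 ... ASC)
        (SELECT MIN(post.dt)
         FROM rsBuildDetails post
         WHERE post.dt > passed.dt
           AND post.val = 'fail') AS SecondFailedDt
    FROM (
        SELECT dt
        FROM rsBuildDetails
        WHERE val = 'pass'
    ) passed
) tbl
WHERE tbl.FailedDt IS NOT NULL
  AND tbl.SecondFailedDt IS NOT NULL;
```

With the sample rows this should yield two date pairs: 2010-01-02 with 2010-01-04, and 2010-01-06 with 2010-01-10. The passing build on 2010-01-01 drops out because no failure precedes it.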

What is the difference between these two SQL?

They are the same as far as the query engine is concerned.

The first type, commonly called a comma join, is an implicit inner join in most (all?) RDBMSs. The syntax is from ANSI SQL 89 and earlier.

The syntax of the second join, called an explicit inner join, was introduced in ANSI SQL 92. It is considered improved syntax because even with complex queries containing dozens of tables it is easy to see the difference between join conditions and filters in the where clause.

The ANSI 92 syntax also allows the query engine to potentially optimize better. If you put a filter in the join condition, it is applied before (or as) the join is done. If the column is indexed, you can get some benefit, since the query engine knows not to bother with certain rows in the table, whereas if you put the filter in the WHERE clause, the query engine may need to join the tables completely and then filter the results. Usually the RDBMS treats the two identically, probably in 999 cases out of 1000, but not always.
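For inner joins the placement of a filter does not change the result, only (at most) the plan. Where it does change the result is with outer joins, which is one practical reason to keep join conditions and filters apart. A small sketch, assuming hypothetical tables customers(id, name) and orders(id, customer_id, total):

```sql
-- Filter in ON: every customer is kept. Orders of 100 or less simply do
-- not match, so such customers appear with a NULL total.
SELECT c.name, o.total
FROM customers c
LEFT OUTER JOIN orders o
  ON o.customer_id = c.id
 AND o.total > 100;

-- Filter in WHERE: it runs after the outer join, and a NULL total fails
-- the comparison, so customers without a qualifying order are dropped.
-- The query behaves like an inner join.
SELECT c.name, o.total
FROM customers c
LEFT OUTER JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;
```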

See also:

Why isn't SQL ANSI-92 standard better adopted over ANSI-89?

ANSI Support of Select Count SQL Statements

The SQL standard that almost all DBMSs use is the ANSI 92 standard, which can be found at http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt. Page 124 has the information that you are looking for. Most DBMSs offer something in addition to the ANSI 92 standard, but this is kind of the lowest common denominator of all of them.
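As a rough illustration of the COUNT forms the SQL-92 text defines, assuming a hypothetical table orders(id, customer_id, total):

```sql
SELECT COUNT(*)                    AS all_rows,           -- every row, NULLs included
       COUNT(customer_id)          AS non_null_customers, -- rows where customer_id is not NULL
       COUNT(DISTINCT customer_id) AS distinct_customers  -- distinct non-NULL values
FROM orders;
```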

SQL inner join, which style is better?

There is no difference in terms of performance. The WHERE clause is in fact the same as INNER JOIN when it comes to relational algebra.

Read here for a brief explanation
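The relational-algebra claim can be written as a single identity. Here R and S stand for any two tables and θ for the join condition; the symbols are chosen for illustration only.

```latex
R \bowtie_{\theta} S \;=\; \sigma_{\theta}\,(R \times S)
```

The right-hand side is what "FROM R, S WHERE θ" expresses: a Cartesian product followed by a selection. The left-hand side is "R INNER JOIN S ON θ". Since the two expressions are equal by definition, an optimizer is free to pick the same plan for both.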


