Authoritative SQL Standard Documentation

Authoritative SQL standard documentation

Quoting from one of my web sites:

We all love open source software. Wouldn’t it be great if
international standard documents such as the SQL standard would be
open too?

As a matter of fact: they are!

However, they are not free—just public. Very much like open source
software is not necessarily free. Too often, we neglect these
differences. Just because we have to pay for the standard doesn't mean
it is secret.

A download of the most relevant part of the SQL standard—part 2—is
available for USD 60 at ANSI. A CD with all parts on it can be bought
from ISO for CHF 352. Not free, but affordable.

You mentioned in some comments that you are mostly interested in part 2, so spending USD 60 might be your best option.

If you just need to know about the syntax up to 2003, there are two great free resources:

  • BNF grammar of SQL-92, SQL:1999 and SQL:2003: http://www.savage.net.au/SQL/
  • Online validator for SQL:1999: https://developer.mimer.com/services/sql-validator-99/

Finally, the complete text of “SQL-99 Complete, Really” is available at the MariaDB knowledge base. However, this book was written in 1999 when no database actually supported the described features. Keep that in mind when using this resource.

Other answers also mentioned "free" copies of the standards available on the web. Yes there are—those are mostly draft versions. I can't tell which of them are legal, so I rather not link them.

Finally a little self ad: I've just launched http://modern-sql.com/ to explain the standard in an easily accessible way to developers. Note that the actual standards text is written like laws are written :) Depending on your background, that might anyway not what you want.

What is the (de-facto) standard for implicit commit on DDL statements?

Thanks to the link provided by @GolezTrol, I was able to read a draft of the SQL:2003 standard.

In 4.33.4 SQL-statements and transaction states it says, that all SQL Schema statements are transaction initiating.

In 4.33.5 SQL-statement atomicity and statement execution contexts it says, that no atomic SQL statement (and SQL Schema statements are considered atomic) may terminate an SQL transaction.

Finally, in 4.35.6 Effects of statements in an SQL-Transaction it says:

The execution of an SQL-statement within an SQL-transaction has no effect on SQL-data or schemas other than the effect stated in the General Rules for that SQL-statement, in the General Rules for Subclause 11.8, "referential constraint definition", in the General Rules for Subclause 11.39, "trigger definition", and in the General Rules for Subclause 11.50, "SQL-invoked routine"

So it seems that an implicit commit within a DDL statement is not permitted by the standard, because it would terminate a transaction and open a new transaction within an atomic statement. It is argueable, if closing a transaction is considered an "effect" on SQL schema or data (4.35.6) - maybe this note is irrelevant for deciding if a commit is allowed implicitly or not.

And in 4.35.1 General description of SQL-transactions it says:

It is implementation-defined whether or not the execution of an SQL-data statement is permitted to occur within the same SQL-transaction as the execution of an SQL-schema statement. If it does occur, then the effect on any open cursor or deferred constraint is implementation-defined. There may be additional implementation defined restrictions, requirements, and conditions. If any such restrictions, requirements, or conditions are violated, then an implementation-defined exception condition or a completion condition warning with an implementation-defined subclass code is raised.

So what happens, if the implementation does not allow data and schema statements in one transaction? Then you are forced to use a COMMIT before and after a group of schema statements. So that would not explain, why each schema statement should be implicitly surrounded by COMMITs.

In general the pages in the standard read like they always take the existance of transacted SQL schema statements for granted.

So my conclusion is, that the standard way is NOT to have implicit COMMITs within an DDL statement.

If you don't mind and if there are no objections / protests against my interpretation of the standard, I will accept what I've found out within the next few days.

Is an ACID complaint SQL database is still ACID even when not explicitly using transactions?

The behaviour you are asking about does vary from one database to another.

With PostgreSQL and Microsoft SQL Server, if you don't use "begin transaction" ... "end transaction" then every statement is treated as a transaction. This behaviour is often described as "auto commit" and you can switch that on as an option in MySQL.

A transaction should be defined by what your application needs. The classic case is if you buy something from me then the money must leave your account and arrive in mine as one transaction. Either both actions succeed or both fail. So in this case, even with auto commit, you must put "begin transaction" ... "end transaction" around the two statements, otherwise the money might leave your account and not arrive in mine, and we'll both be unhappy.

In fact the banking transaction is more complex than that; this is a simple illustration.

Many applications require multiple SQL statements to happen as a unit, and for these you always need transactions.

The variety of behaviour means you must check the documentation. Some of the major databases (not the three mentioned above) start a new transaction on various events such as the end of the previous transaction, or a DDL statement (create, drop, alter and truncate in some cases). If you need transactions, and most real-world applications do, then you must check the documentation for the database that you are using, and do lots of multi-user testing.

There is a longer explanation here and the database documentation for the database you choose to use is the authoritative source.

SQL Best Practices for Identity value hard coding

Seems to me the strongest argument to your report writers is your second to last sentence "...those values can and are absolutely different [between environments]". That would be pretty much the gist of my response to them.

Of course there's always gray area to any question. Identity columns are essentially magic numbers. They have the benefit to the database of being...

  • Small
  • Sequential
  • Fast to seek and join on, sort by and create

...but have the downside of being of completely meaningless, and in effect, randomly assigned (sort the inserts into that table one way, you get a different identity per row than if you sorted the other way). As such, in cases where you have to look up something specific like that, it's common use also include a "business/natural/alternate" key (e.g. maybe (a completely made up example) [CategoryName] where CatgoryName is something short, unique and human readable, while. [CategoryId] is an identity, but not something intended to be sought on)

If you have a website with, say, a dropdown menu, usually the natural key gets put into the visible part of the drop down, and the surrogate/identity key gets passed around on the back end, invisible to the end user.

This gets a little trickier when you have people writing queries directly against the database. If they're owners of the data, they may know things about the larger data structure which they can take advantage of in *cough "clever" ways. If you know the keys wont change and you know what those values are, there might be a case to be made just referencing those. But again, not if they're going to be different when you query a different server.

Of course the flip side is, if you don't want them to use the identity values, you'll have to give them an alternative. And if your tables don't already include a business/natural/alternate key, you're going to have to add one wherever one doesn't already exist.

Also, there's nothing wrong with that alternate key being an integer too (maybe you already have company-wide identifiers for your offices of 1, 2, 3 etc), but the point is that it's deterministic no matter where you run your query.

SQL UPDATE read column values before setting

James R. Groff, Paul N. Weinberg: SQL The complete reference (Osborne 1999), page 209 states

start quote

If an expression in the assignment list references one of the columns of the target table,
the value used to calculate the expression is the value of that column in the current row
before any updates are applied. The same is true of column references that occur in the
WHERE clause. For example, consider this (somewhat contrived) UPDATE statement:

UPDATE OFFICES
SET QUOTA = 400000.00, SALES = QUOTA
WHERE QUOTA < 400000.00

Before the update, Bill Adams had a QUOTA value of $350,000 and a SALES value of
$367,911. After the update, his row has a SALES value of $350,000, not $400,000. The
order of the assignments in the SET clause is thus immaterial; the assignments can be
specified in any order.

end quote

The same is supported by chapter 13.9 item 6, page 393, of a draft to the ANSI-92 SQL standard (X3H2-93-004), found here.

This is the most implementaion independent and the closest I could get so far.

Other sources of X3H2-93-004 can be found e.g. here (pg 590, item 15)



Related Topics



Leave a reply



Submit