Best Way to Test SQL Queries

Best way to test SQL queries

You wouldn't write an application with functions 200 lines long. You'd decompose those long functions into smaller functions, each with a single clearly defined responsibility.

Why write your SQL like that?

Decompose your queries, just like you decompose your functions. This makes them shorter, simpler, easier to comprehend, easier to test, easier to refactor. And it allows you to add "shims" between them, and "wrappers" around them, just as you do in procedural code.

How do you do this? By making each significant thing a query does into a view. Then you compose more complex queries out of these simpler views, just as you compose more complex functions out of more primitive functions.

And the great thing is, for most compositions of views, you'll get exactly the same performance out of your RDBMS. (For some you won't; so what? Premature optimization is the root of all evil. Code correctly first, then optimize if you need to.)

Here's an example of using several view to decompose a complicated query.

In the example, because each view adds only one transformation, each can be independently tested to find errors, and the tests are simple.

Here's the base table in the example:

create table month_value( 
eid int not null, month int, year int, value int );

This table is flawed, because it uses two columns, month and year, to represent one datum, an absolute month. Here's our specification for the new, calculated column:

We'll do that as a linear transform, such that it sorts the same as (year, month), and such that for any (year, month) tuple there is one and only value, and all values are consecutive:

create view cm_absolute_month as 
select *, year * 12 + month as absolute_month from month_value;

Now what we have to test is inherent in our spec, namely that for any tuple (year, month), there is one and only one (absolute_month), and that (absolute_month)s are consecutive. Let's write some tests.

Our test will be a SQL select query, with the following structure: a test name and a case statement catenated together. The test name is just an arbitrary string. The case statement is just case when test statements then 'passed' else 'failed' end.

The test statements will just be SQL selects (subqueries) that must be true for the test to pass.

Here's our first test:

--a select statement that catenates the test name and the case statement
select concat(
-- the test name
'For every (year, month) there is one and only one (absolute_month): ',
-- the case statement
case when
-- one or more subqueries
-- in this case, an expected value and an actual value
-- that must be equal for the test to pass
( select count(distinct year, month) from month_value)
--expected value,
= ( select count(distinct absolute_month) from cm_absolute_month)
-- actual value
-- the then and else branches of the case statement
then 'passed' else 'failed' end
-- close the concat function and terminate the query
);
-- test result.

Running that query produces this result: For every (year, month) there is one and only one (absolute_month): passed

As long as there is sufficient test data in month_value, this test works.

We can add a test for sufficient test data, too:

select concat( 'Sufficient and sufficiently varied month_value test data: ',
case when
( select count(distinct year, month) from month_value) > 10
and ( select count(distinct year) from month_value) > 3
and ... more tests
then 'passed' else 'failed' end );

Now let's test it's consecutive:

select concat( '(absolute_month)s are consecutive: ',
case when ( select count(*) from cm_absolute_month a join cm_absolute_month b
on ( (a.month + 1 = b.month and a.year = b.year)
or (a.month = 12 and b.month = 1 and a.year + 1 = b.year) )
where a.absolute_month + 1 <> b.absolute_month ) = 0
then 'passed' else 'failed' end );

Now let's put our tests, which are just queries, into a file, and run the that script against the database. Indeed, if we store our view definitions in a script (or scripts, I recommend one file per related views) to be run against the database, we can add our tests for each view to the same script, so that the act of (re-) creating our view also runs the view's tests. That way, we both get regression tests when we re-create views, and, when the view creation runs against production, the view will will also be tested in production.

How to test your query first before running it sql server

First assume you will make a mistake when updating a db so never do it unless you know how to recover, if you don't don't run the code until you do,

The most important idea is it is a dev database expect it to be messed up - so make sure you have a quick way to reload it.

The do a select first is always a good idea to see which rows are affected.

However for a quicker way back to a good state of the database which I would do anyway is

For a simple update etc

Use transactions

Do a begin transaction and then do all the updates etc and then select to check the data

The database will not be affected as far as others can see until you do a last commit which you only do when you are sure all is correct or a rollback to get to the state that was at the beginning

Generate SQL Statements for database test

You should give Random Query Generator a try, some Database-Developer Teams use it for same reasons.

Code:
https://launchpad.net/randgen

Documentation:
https://github.com/RQG/RQG-Documentation/wiki/Category%3ARandomQueryGenerator

As I can see, you may only need some parts of the tool itself (depends on if you use a MySQL-DB/JDB or not). Especially for MySQL there is a fully automated process described on how to test the queries against an already running server.

I hope this is what you need.

How to unit test an SQL query?

Just pass a SQL query, and compare the returned result to expected result. Simple. JUnit is a unit test framework, you can utilise that.

For sophisticated database unit testing, look at DBUnit.

What best practices do you use for testing database queries?

Testing stored procs will require that each person who tests has a separate instance of the db. This is a requirement. If you share environments you won't be able to rely upon the results of your test. They'll be worthless.

You will also need to ensure that you roll back the db to it's previous state after every test so as to make the results predictable and stable. Because of this need to roll back the state after every test these tests will take a lot longer to complete than standard unit tests so they'll probably be something you want to run over night.

There are a few tools out there to help you with this. DbUnit is one of them and I also believe Microsoft had a tool Visual Studio for Database Professionals that contained some support for DB testing.



Related Topics



Leave a reply



Submit