Database Design for Revisions?
- Do not put it all in one table with an IsCurrent discriminator attribute. This just causes problems down the line, requires surrogate keys and all sorts of other problems.
- Design 2 does have problems with schema changes. If you change the Employees table you have to change the EmployeeHistories table and all the related sprocs that go with it. Potentially doubles you schema change effort.
- Design 1 works well and if done properly does not cost much in terms of a performance hit. You could use an xml schema and even indexes to get over possible performance problems. Your comment about parsing the xml is valid but you could easily create a view using xquery - which you can include in queries and join to. Something like this...
CREATE VIEW EmployeeHistory
AS
, FirstName, , DepartmentId
SELECT EmployeeId, RevisionXML.value('(/employee/FirstName)[1]', 'varchar(50)') AS FirstName,
RevisionXML.value('(/employee/LastName)[1]', 'varchar(100)') AS LastName,
RevisionXML.value('(/employee/DepartmentId)[1]', 'integer') AS DepartmentId,
FROM EmployeeHistories
Database Design with Change History
One way is to have a MyTableNameHistory
for every table in your database, and make its schema identical to the schema of table MyTableName
, except that the Primary Key of the History table has one additional column named effectiveUtc
as a DateTime. For example, if you have a table named Employee
,
Create Table Employee
{
employeeId integer Primary Key Not Null,
firstName varChar(20) null,
lastName varChar(30) Not null,
HireDate smallDateTime null,
DepartmentId integer null
}
Then the History table would be
Create Table EmployeeHistory
{
employeeId integer Not Null,
effectiveUtc DateTime Not Null,
firstName varChar(20) null,
lastName varChar(30) Not null,
HireDate smallDateTime null,
DepartmentId integer null,
Primary Key (employeeId , effectiveUtc)
}
Then, you can put a trigger on Employee table, so that every time you insert, update, or delete anything in the Employee table, a new record is inserted into the EmployeeHistory table with the exact same values for all the regular fields, and current UTC datetime in the effectiveUtc column.
Then to find the values at any point in the past, you just select the record from the history table whose effectiveUtc value is the highest value prior to the asOf datetime you want the value as of.
Select * from EmployeeHistory h
Where EmployeeId = @EmployeeId
And effectiveUtc =
(Select Max(effectiveUtc)
From EmployeeHistory
Where EmployeeId = h.EmployeeId
And effcetiveUtc < @AsOfUtcDate)
Database design for create/revise/delete request form
It sounds to me like a revision can change anything about a course, so if you had separate CreateRequest and ReviseRequest tables, pretty much every column would have to be duplicated. I don't see any good reason to do that.
Deletion might arguably be more debatable as most of the data is irrelevant. Still, I don't see anything gained by making a separate table for it. So what if irrelevant fields are null? So it has a bunch of nulls.
Actually I wouldn't create separate columns for "revision reason" and "delete reason". I'd just make one column and call it "request reason" or some such.
You should have some field that tells you which type of request it is. (Or is that what FormTypeID is?)
How to version control a record in a database
Let's say you have a FOO
table that admins and users can update. Most of the time you can write queries against the FOO table. Happy days.
Then, I would create a FOO_HISTORY
table. This has all the columns of the FOO
table. The primary key is the same as FOO plus a RevisionNumber column. There is a foreign key from FOO_HISTORY
to FOO
. You might also add columns related to the revision such as the UserId and RevisionDate. Populate the RevisionNumbers in an ever-increasing fashion across all the *_HISTORY
tables (i.e. from an Oracle sequence or equivalent). Do not rely on there only being one change in a second (i.e. do not put RevisionDate
into the primary key).
Now, every time you update FOO
, just before you do the update you insert the old values into FOO_HISTORY
. You do this at some fundamental level in your design so that programmers can't accidentally miss this step.
If you want to delete a row from FOO
you have some choices. Either cascade and delete all the history, or perform a logical delete by flagging FOO
as deleted.
This solution is good when you are largely interested in the current values and only occasionally in the history. If you always need the history then you can put effective start and end dates and keep all the records in FOO
itself. Every query then needs to check those dates.
Database Design: how to track history?
I prefer to have additional historical table for each versioned table. Same structure as main table with time_from
and time_to
additional fields.
Transparently filled with triggers. time_to
of latest revision set to far far future.
State for specified moment can be retrieved with query like this:
SELECT * FROM user_history
WHERE time_from >= '2012-02-01' AND time_to <= '2012-02-01'
As for me, storing history within main table is not generally a good idea, as it requires complicated conditions when retrieving or joining current data.
database design
Option 1:
Create a Forms table, that is the central source of Form IDs. When inserting into Tasks/Briefs/etc (I assume there may be many more), you first insert into the Forms table, then use that ID as a PK/FK in Tasks or Briefs.
Then you have FormID in Revisions just reference the FormID in Forms.
You'd probably want to record the form type in the Forms table, and possibly enforce that if you've inserted say "1,Task" into Forms, that the only table you can insert 1 into is the Tasks table. There are ways to do that.
You might also want to hide some of these facts behind views/triggers (so you just insert into Tasks, and behind the scenes it does the insert into Forms, etc).
Option 2:
Create one column per table in the Revisions table. Add a constraint such that exactly one of these columns is not null. This does allow you to use more specific foreign key constraints (unlike option 1, where just because there's a Form ("1,Task") in Forms, you couldn't guarantee there would be a row in the Tasks table). But if you're adding new form types, you have to add more columns to Revisions.
There are some other options, but these are the two that spring to mind.
Database design: Keeping track of tag changes between revisions
Do not overcomplicate your schema and do not optimize prematurely. Pair-wise comparison of revision tags is a very cheap operation (provided you have all data in memory) and it can be done while prepararing view model.
Related Topics
T-SQL Split String Based on Delimiter
How to Assign an Exec Result to a SQL Variable
SQL Rownum How to Return Rows Between a Specific Range
SQL Server:Transpose Rows to Columns
SQL Rank() Versus Row_Number()
SQL Server - Transactions Roll Back on Error
Key Value Pairs in Relational Database
SQL Update Set One Column to Be Equal to a Value in a Related Table Referenced by a Different Column
Delete Column from Sqlite Table
What Does Include() Do in Linq
MySQL "Between" Clause Not Inclusive
How to Capitalize the First Letter of Each Word in a String in SQL Server
Delete All Rows in a Table Based on Another Table
When to Use Varchar and Date/Datetime
Using Stored Procedure in Classical Asp .. Execute and Get Results