What to do when I want to use database constraints but only mark as deleted instead of deleting?
You could add the id value to the end of the name when a record is deleted, so when someone deletes id 3 the name becomes Thingy3_3 and then when they delete id 100 the name becomes Thingy3_100. This would allow you to create a unique composite index on the name and deleted fields but you then have to filter the name column whenever you display it and remove the id from the end of the name.
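Perhaps a better solution would be to replace your deleted column with a deleted_at column of type DATETIME. You could then maintain a unique index on name and deleted_at, with a non-deleted record having a NULL value in the deleted_at field. This prevents two active records from sharing a name, while still allowing the same name to be deleted multiple times. Note that this relies on SQL Server treating NULLs as equal in a unique index; databases that treat NULLs as distinct in unique indexes (MySQL, and PostgreSQL by default) would still accept duplicate active names.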
You obviously need to do a test when undeleting a record to ensure that there is no row with the same name and a null deleted_at field before allowing the un-delete.
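The undelete check described above can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, the undelete function is a hypothetical helper, and the table mirrors the swtest example with text timestamps instead of DATETIME.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE swtest (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        deleted_at TEXT            -- NULL means the row is active
    );
""")


def undelete(conn, row_id):
    """Reactivate a soft-deleted row unless an active row already uses its name."""
    with conn:  # commits on success, rolls back on exception
        row = conn.execute(
            "SELECT name FROM swtest WHERE id = ? AND deleted_at IS NOT NULL",
            (row_id,),
        ).fetchone()
        if row is None:
            return False  # no such row, or it is not deleted
        clash = conn.execute(
            "SELECT 1 FROM swtest WHERE name = ? AND deleted_at IS NULL",
            (row[0],),
        ).fetchone()
        if clash is not None:
            return False  # undeleting would create a second active row
        conn.execute("UPDATE swtest SET deleted_at = NULL WHERE id = ?", (row_id,))
        return True


conn.executemany(
    "INSERT INTO swtest (id, name, deleted_at) VALUES (?, ?, ?)",
    [
        (1, "Thingy1", "2009-04-21 08:55:38"),  # deleted, name is free
        (2, "Thingy2", "2009-04-21 08:55:38"),  # deleted, name reused by id 3
        (3, "Thingy2", None),                   # active
    ],
)
conn.commit()

restored_free = undelete(conn, 1)   # name is free, so the row comes back
restored_clash = undelete(conn, 2)  # refused: id 3 is already active with this name
```

With concurrent writers a check like this can race, so a unique index on the active rows is still worth keeping as the final safeguard.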
You could actually implement all of this logic within the database by using an INSTEAD OF trigger for the delete. This trigger would not delete records but would instead set the deleted_at column whenever you deleted a record.
The following example code demonstrates this:
CREATE TABLE swtest (
    id INT IDENTITY,
    name NVARCHAR(20),
    deleted_at DATETIME -- NULL while the row is active
)
GO
-- Turn every DELETE on swtest into a soft delete
CREATE TRIGGER tr_swtest_delete ON swtest
INSTEAD OF DELETE
AS
BEGIN
    UPDATE swtest SET deleted_at = GETDATE()
    WHERE id IN (SELECT deleted.id FROM deleted)
    AND deleted_at IS NULL -- Required to prevent duplicates when deleting already deleted records
END
GO
CREATE UNIQUE INDEX ix_swtest1 ON swtest(name, deleted_at)
INSERT INTO swtest (name) VALUES ('Thingy1')
INSERT INTO swtest (name) VALUES ('Thingy2')
DELETE FROM swtest WHERE id = SCOPE_IDENTITY()
INSERT INTO swtest (name) VALUES ('Thingy2')
DELETE FROM swtest WHERE id = SCOPE_IDENTITY()
INSERT INTO swtest (name) VALUES ('Thingy2')
SELECT * FROM swtest
DROP TABLE swtest
The SELECT in this script returns the following:
id name deleted_at
1 Thingy1 NULL
2 Thingy2 2009-04-21 08:55:38.180
3 Thingy2 2009-04-21 08:55:38.307
4 Thingy2 NULL
So within your code you can delete records using a normal DELETE and let the trigger take care of the details. The only possible issue that I could see was that deleting already deleted records could result in duplicate rows, hence the condition in the trigger that leaves the deleted_at field untouched on an already deleted row.
Relational Database: DELETE versus Mark for Deletion
There are many reasons to not use DELETE. First, maintaining history can be very important. I wouldn't use "just" a delete flag, but instead have dates of validity.
Second, in an operational system, DELETE can be an expensive operation. The row needs to be deleted from the table, from associated indexes, and then there might be cascading deletes and triggers.
Third, DELETE can prevent other operations from working well, because tables and rows and indexes get locked. This can slow down an operational system, particularly during peak periods.
Fourth, DELETE can make it tricky to maintain relational integrity, especially if the cascading deletes are not defined.
Fifth, storage is cheap. Processing power is cheap. So, for many databases, deleting records to recover space is simply unnecessary.
This doesn't mean that you should always avoid deleting records. But there are very valid reasons for not rushing to remove data.
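As a rough illustration of the "dates of validity" idea mentioned above, here is a minimal sketch using Python's sqlite3 module. The widget table, its valid_from and valid_to columns, and the names_valid_on helper are all assumptions made for the example, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE widget (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        valid_from TEXT NOT NULL,  -- ISO dates compare correctly as text
        valid_to   TEXT            -- NULL means still valid
    );
    INSERT INTO widget (name, valid_from, valid_to) VALUES
        ('size 4', '2020-01-01', '2021-06-30'),  -- retired, but history is kept
        ('size 5', '2020-01-01', NULL);
""")


def names_valid_on(conn, day):
    """Return the names of rows whose validity range covers the given ISO date."""
    rows = conn.execute(
        """
        SELECT name FROM widget
        WHERE valid_from <= ?
          AND (valid_to IS NULL OR valid_to >= ?)
        ORDER BY name
        """,
        (day, day),
    ).fetchall()
    return [r[0] for r in rows]


current = names_valid_on(conn, "2024-01-01")     # only rows still valid today
historical = names_valid_on(conn, "2021-01-01")  # includes the later-retired row
```

Instead of deleting a row, the application sets valid_to, and historical questions can still be answered by querying as of an earlier date.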
Database constraints to ignore soft-deleted entries when evaluating uniqueness
You can't actually tell the builder to ignore values from deleted entries, since the builder just adds a native MySQL/PostgreSQL constraint to your table.
You would have to do this manually when adding a new user, e.g. query the full table, including deleted entries, and go from there.
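Where the database supports partial indexes, uniqueness can be restricted to non-deleted rows at the database level. The sketch below shows the idea with Python's sqlite3 module; PostgreSQL accepts the same WHERE clause on CREATE UNIQUE INDEX, while MySQL has no partial indexes. The users table, email column, and index name are assumptions for illustration, and such an index would probably have to be created with a raw SQL statement rather than through the builder's unique helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        email      TEXT NOT NULL,
        deleted_at TEXT            -- NULL means the row is active
    );
    -- Uniqueness is enforced only among rows that are not soft-deleted
    CREATE UNIQUE INDEX ux_users_email_active
        ON users (email)
        WHERE deleted_at IS NULL;
""")

conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
conn.execute(
    "UPDATE users SET deleted_at = '2024-01-01 00:00:00' WHERE email = 'a@example.com'"
)
# The soft-deleted row no longer blocks reuse of the address
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

try:
    # A second active row with the same address violates the partial index
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```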
How to add constraint for deleting in sql
Since you want to perform soft deletes, you can accomplish what you want by adding some additional helper columns and foreign keys:
create table Books (
    Id uniqueidentifier primary key,
    Title varchar(255) not null,
    Author varchar(255) not null,
    Deleted datetime null,
    -- 0 while the row is live, 1 once it has been soft-deleted
    _DelXRef as CASE WHEN Deleted is null then 0 else 1 END persisted,
    constraint UQ_Books_DelXRef UNIQUE (Id,_DelXRef)
)
create table Categories (
    Id uniqueidentifier primary key,
    Name varchar(255) not null,
    Deleted datetime null,
    _DelXRef as CASE WHEN Deleted is null then 0 else 1 END persisted,
    constraint UQ_Categories_DelXRef UNIQUE (Id,_DelXRef)
)
create table BookCategories (
    BookId uniqueidentifier not null,
    CategoryId uniqueidentifier not null,
    -- Always 0, so the composite foreign keys can only match live rows
    _DelXRef as 0 persisted,
    constraint FK_BookCategories_Books foreign key (BookId) references Books(Id),
    constraint FK_BookCategories_Books_DelXRef foreign key (BookId,_DelXRef) references Books(Id,_DelXRef),
    constraint FK_BookCategories_Categories foreign key (CategoryId) references Categories(Id),
    constraint FK_BookCategories_Categories_DelXRef foreign key (CategoryId,_DelXRef) references Categories(Id,_DelXRef)
)
Hopefully, you can see how the foreign keys ensure that the _DelXRef columns in the referenced tables have to remain 0 at all times, and so it's not possible to set Deleted to any non-NULL value whilst the row is being referenced from the BookCategories table.
(At this point, the "original" foreign keys, FK_BookCategories_Books and FK_BookCategories_Categories, appear to be redundant. I prefer to keep them in the model to document the real FK relationships. I'm also using my own convention of prefixing objects with _ where it's not intended that they be used by the users of the database; they exist simply to allow DRI to be enforced.)
SQL Server DB - Deleting Records, or setting an IsDeleted flag?
As a rule of thumb I never delete any data. In the type of business I am in there are always questions such as 'Of the customers that cancelled, how many of them had a widget of size 4?' If I had deleted the customer, how could I answer that? Or, more likely, if I had deleted a widget of size 4 from the widget table, this would cause a problem with referential integrity. An 'Active' bit flag seems to work for me, and with indexing there is no big performance hit.
Foreign key constraints: When to use ON UPDATE and ON DELETE
Do not hesitate to put constraints on the database. You'll be sure to have a consistent database, and that's one of the good reasons to use a database. Especially if you have several applications requesting it (or just one application but with a direct mode and a batch mode using different sources).
With MySQL you do not have advanced constraints like you would have in PostgreSQL, but at least the foreign key constraints are quite advanced.
We'll take an example: a COMPANY table with a USER table containing people from these companies.
CREATE TABLE COMPANY (
    company_id INT NOT NULL,
    company_name VARCHAR(50),
    PRIMARY KEY (company_id)
) ENGINE=INNODB;
CREATE TABLE USER (
    user_id INT,
    user_name VARCHAR(50),
    company_id INT,
    INDEX company_id_idx (company_id),
    FOREIGN KEY (company_id) REFERENCES COMPANY (company_id) ON...
) ENGINE=INNODB;
Let's look at the ON UPDATE clause:
- ON UPDATE RESTRICT : the default : if you try to update a company_id in table COMPANY, the engine will reject the operation if at least one USER links to this company.
- ON UPDATE NO ACTION : same as RESTRICT.
- ON UPDATE CASCADE : usually the best one : if you update a company_id in a row of table COMPANY, the engine will update it accordingly on all USER rows referencing this COMPANY (but, be warned, no triggers are activated on the USER table). The engine tracks the changes for you, which is good.
- ON UPDATE SET NULL : if you update a company_id in a row of table COMPANY, the engine will set the company_id of related USERs to NULL (so the USER company_id column must allow NULL). I cannot see any interesting thing to do with that on an update, but I may be wrong.
And now on the ON DELETE side:
- ON DELETE RESTRICT : the default : if you try to delete a row in table COMPANY, the engine will reject the operation if at least one USER links to this company. This can save your life.
- ON DELETE NO ACTION : same as RESTRICT.
- ON DELETE CASCADE : dangerous : if you delete a company row in table COMPANY, the engine will delete the related USERs as well. This is dangerous but can be used to make automatic cleanups on secondary tables (so it can be something you want, but quite certainly not for a COMPANY<->USER example).
- ON DELETE SET NULL : handy : if you delete a COMPANY row, the related USERs will automatically have the relationship set to NULL. If NULL is your value for users with no company this can be a good behavior; for example, maybe you need to keep the users in your application as authors of some content, but removing the company is not a problem for you.
Usually my default is ON DELETE RESTRICT ON UPDATE CASCADE, with some ON DELETE CASCADE for tracking tables (logs, though not all logs, and things like that) and ON DELETE SET NULL when the master table is a 'simple attribute' of the table containing the foreign key, like a JOB table for the USER table.
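The behaviors listed above can be tried out quickly. The sketch below uses Python's sqlite3 module as a stand-in for MySQL/InnoDB, so it only illustrates the referential actions themselves; defaults and edge cases differ between engines (SQLite, for instance, only enforces foreign keys after PRAGMA foreign_keys = ON). The make_db helper and the sample rows are assumptions for the example.

```python
import sqlite3


def make_db(on_delete):
    """Create COMPANY/USER tables whose foreign key uses the given ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.executescript(f"""
        CREATE TABLE company (
            company_id   INTEGER PRIMARY KEY,
            company_name TEXT
        );
        CREATE TABLE user (
            user_id    INTEGER PRIMARY KEY,
            user_name  TEXT,
            company_id INTEGER
                REFERENCES company (company_id)
                ON DELETE {on_delete} ON UPDATE CASCADE
        );
        INSERT INTO company VALUES (1, 'Acme');
        INSERT INTO user VALUES (10, 'alice', 1);
    """)
    return conn


# ON UPDATE CASCADE: renumbering the company follows through to its users.
conn = make_db("RESTRICT")
conn.execute("UPDATE company SET company_id = 2 WHERE company_id = 1")
cascaded_id = conn.execute(
    "SELECT company_id FROM user WHERE user_id = 10"
).fetchone()[0]

# ON DELETE RESTRICT: the delete is rejected while a user references the company.
try:
    conn.execute("DELETE FROM company WHERE company_id = 2")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True

# ON DELETE CASCADE: the dependent users disappear along with the company.
conn = make_db("CASCADE")
conn.execute("DELETE FROM company WHERE company_id = 1")
users_after_cascade = conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]

# ON DELETE SET NULL: the users are kept but lose their company reference.
conn = make_db("SET NULL")
conn.execute("DELETE FROM company WHERE company_id = 1")
company_after_set_null = conn.execute(
    "SELECT company_id FROM user WHERE user_id = 10"
).fetchone()[0]
```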
Edit
It's been a long time since I wrote that. Now I think I should add one important warning. MySQL has one big documented limitation with cascades: cascades do not fire triggers. So if you were confident enough in that engine to use triggers, you should avoid cascading constraints.
- http://dev.mysql.com/doc/refman/5.6/en/triggers.html
MySQL triggers activate only for changes made to tables by SQL statements. They do not activate for changes in views, nor by changes to tables made by APIs that do not transmit SQL statements to the MySQL Server
- http://dev.mysql.com/doc/refman/5.6/en/stored-program-restrictions.html#stored-routines-trigger-restrictions
==> See below the last edit, things are moving on this domain
Triggers are not activated by foreign key actions.
And I do not think this will get fixed one day. Foreign key constraints are managed by the InnoDB storage engine and triggers are managed by the MySQL SQL engine. The two are separate. InnoDB is the only storage engine with constraint management; maybe they'll add triggers directly in the storage engine one day, maybe not.
But I have my own opinion on which element you should choose between the poor trigger implementation and the very useful foreign key constraint support. And once you get used to database consistency you'll love PostgreSQL.
12/2017-Updating this Edit about MySQL:
as stated by @IstiaqueAhmed in the comments, the situation has changed on this subject. So follow the link and check the real up-to-date situation (which may change again in the future).