DB design to use sub-type or not?
Maybe a bit different approach -- supertype/subtype is usually used when you have very specific columns for each subtype, like in Person supertype with Patient and Doctor subtypes. Person holds all data common to people and Patient and Doctor hold very specific columns for each one. In this example your book_notes
and article_notes
are not really that different.
I would rather consider having a supertype Publication with Book and Article as subtypes. Then you can have just one Note table with FK to Publication. Considering that a PK number in Publication is the same number as the [PK,FK] of Book (Article) you can do joins with notes on Publication, Book or Article. This way you can simply add another publication, like Magazine by adding a new sub-classed table and not changing anything regarding Note.
For example:
TABLE Publication (
ID (PK)
, Title
, -- more columns common to any publication
)
TABLE Book (
ID (PK) = FK to Publication
, ISBN
, -- more columns specific to books only
)
TABLE Article (
ID (PK) = FK to Publication
, -- more columns specific to articles only)
TABLE Note (
ID (PK)
, PublicationID = FK to Publication
, NoteText
)
Primary key for Book
and Article
tables also serves as a foreign key to the Publication
.
Now if we add another publication, Magazine:
TABLE Magazine (
ID (PK) = FK to Publication
, -- more columns specific to magazines only
)
We do not have to modify Note
in any way -- and we have added columns specific to magazines only.
Database design - single subtype in a separate table vs common table
The "clean" implementation of subtypes is, in my perception, to create a separate table for each subtype and a common one for the supertype.
This avoids complex integrity conditions and reduces the number of null values.
To illustrate how complex integrity conditions arise, just imagine you have a supertype and a subtype with 10 additional mandatory properties ("columns") and several optional properties.
Now if a single optional property is non-null, the 10 additional mandatory properties must be non-null as well.
This gets worse if you imagine you have 12 subtypes instead of just one.
On the other hand, if you store everything in a single table, you don't have to perform joins. This is a performance advantage that can add up if you often need the additional columns.
Naturally this is only partially true. If you have many subtypes, the rows will be long. This reduces the effectiveness of your data cache.
If your application doesn't need the additional information very often, it is probably better to keep a separate table for the additional columns. If it needs all the information all the time, you'll probably be better of with a single table containing everything.
In short: there is no general answer to your question.
The best approach will be to make a guess based on your application and my considerations. You then implement this and test if the performance of your implementation meets your requirements. If so, you have a valid implementation. If not, try the other strategy.
Should subtype/supertype used in databses?
it depends on the details. There are arguments for each approach. Basically, the more data that is shared between the tables, the more sense it makes to 'normalize' the data and use 'supertype/subtype' as you put it. Note that if you take this approach, your sql can get pretty complicated and you will have to join across the tables.
Another option is to have one table and use a simple column like 'comment_type' to differentiate between if it is a blog, article, or profile. The sql for that approach would be real simple too and its just a where comment_type = 'whatever'
. Be sure to index the 'comment_type' column. This approach makes less sense if the columns in your tables are drastically different.
Database Design: Join tables on supertype/subtype
It make sense to join to the supertype in some databases, it make sense to join to the individual subtypes in some databases, and it make sense to join different data to both supertype and to subtypes in some databases.
Is it really the case that a table for (ProductA, User) will have different columns than (ProductB, User)? If so, then there's your answer. Different attributes, different things, different tables. Separate m:n tables, and reference the subtype.
(Later)
I answered another SO question and included code that did exactly this.
Related Topics
Dynamic SQL to Generate Column Names
Count Based on Condition in SQL Server
How to Get the Last Row of an Oracle Table
Left Join Turns into Inner Join
Best Way to Count Rows by Arbitrary Time Intervals
How to Install Localdb Separately
Oracle Joins - Comparison Between Conventional Syntax VS Ansi Syntax
Postgresql Sequence Based on Another Column
Combine Two Columns and Add into One New Column
Rbar VS. Set Based Programming for SQL
Concat Field Value to String in SQL Server
"Operator Does Not Exist: Integer =" When Using Postgres
Dynamic Oracle Pivot_In_Clause
Get the Distinct Sum of a Joined Table Column
How to Count Occurrences of a Column Value Efficiently in SQL