Db Design to Use Sub-Type or Not

DB design to use sub-type or not?

Maybe a bit different approach -- supertype/subtype is usually used when you have very specific columns for each subtype, like in Person supertype with Patient and Doctor subtypes. Person holds all data common to people and Patient and Doctor hold very specific columns for each one. In this example your book_notes and article_notes are not really that different.

I would rather consider having a supertype Publication with Book and Article as subtypes. Then you can have just one Note table with FK to Publication. Considering that a PK number in Publication is the same number as the [PK,FK] of Book (Article) you can do joins with notes on Publication, Book or Article. This way you can simply add another publication, like Magazine by adding a new sub-classed table and not changing anything regarding Note.

For example:

TABLE Publication (
ID (PK)
, Title
, -- more columns common to any publication
)

TABLE Book (
ID (PK) = FK to Publication
, ISBN
, -- more columns specific to books only
)

TABLE Article (
ID (PK) = FK to Publication
, -- more columns specific to articles only)

TABLE Note (
ID (PK)
, PublicationID = FK to Publication
, NoteText
)

Primary key for Book and Article tables also serves as a foreign key to the Publication.

Now if we add another publication, Magazine:

TABLE Magazine (
ID (PK) = FK to Publication
, -- more columns specific to magazines only
)

We do not have to modify Note in any way -- and we have added columns specific to magazines only.


pub_model_01

Database design - single subtype in a separate table vs common table

The "clean" implementation of subtypes is, in my perception, to create a separate table for each subtype and a common one for the supertype.
This avoids complex integrity conditions and reduces the number of null values.

To illustrate how complex integrity conditions arise, just imagine you have a supertype and a subtype with 10 additional mandatory properties ("columns") and several optional properties.
Now if a single optional property is non-null, the 10 additional mandatory properties must be non-null as well.
This gets worse if you imagine you have 12 subtypes instead of just one.

On the other hand, if you store everything in a single table, you don't have to perform joins. This is a performance advantage that can add up if you often need the additional columns.
Naturally this is only partially true. If you have many subtypes, the rows will be long. This reduces the effectiveness of your data cache.

If your application doesn't need the additional information very often, it is probably better to keep a separate table for the additional columns. If it needs all the information all the time, you'll probably be better of with a single table containing everything.

In short: there is no general answer to your question.
The best approach will be to make a guess based on your application and my considerations. You then implement this and test if the performance of your implementation meets your requirements. If so, you have a valid implementation. If not, try the other strategy.

Should subtype/supertype used in databses?

it depends on the details. There are arguments for each approach. Basically, the more data that is shared between the tables, the more sense it makes to 'normalize' the data and use 'supertype/subtype' as you put it. Note that if you take this approach, your sql can get pretty complicated and you will have to join across the tables.

Another option is to have one table and use a simple column like 'comment_type' to differentiate between if it is a blog, article, or profile. The sql for that approach would be real simple too and its just a where comment_type = 'whatever'. Be sure to index the 'comment_type' column. This approach makes less sense if the columns in your tables are drastically different.

Database Design: Join tables on supertype/subtype

It make sense to join to the supertype in some databases, it make sense to join to the individual subtypes in some databases, and it make sense to join different data to both supertype and to subtypes in some databases.

Is it really the case that a table for (ProductA, User) will have different columns than (ProductB, User)? If so, then there's your answer. Different attributes, different things, different tables. Separate m:n tables, and reference the subtype.

(Later)

I answered another SO question and included code that did exactly this.



Related Topics



Leave a reply



Submit