Decision between storing lookup table IDs or pure data
You can use a lookup table with a VARCHAR primary key, and have your main data table reference it through a FOREIGN KEY on its color column, with cascading updates.
CREATE TABLE ColorLookup (
  color VARCHAR(20) PRIMARY KEY
);
CREATE TABLE ItemsWithColors (
  ...other columns...,
  color VARCHAR(20),
  FOREIGN KEY (color) REFERENCES ColorLookup(color)
    ON UPDATE CASCADE ON DELETE SET NULL
);
This solution has the following advantages:
- You can query the color names in the main data table without requiring a join to the lookup table.
- Nevertheless, color names are constrained to the set of colors in the lookup table.
- You can get a list of unique color names (even if none are currently in use in the main data) by querying the lookup table.
- If you change a color in the lookup table, the change automatically cascades to all referencing rows in the main data table.
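To see the cascade and the constraint in action, here is a minimal sketch that runs the same DDL against an in-memory SQLite database from Python. SQLite is my assumption for the demo (the tables above are engine-neutral), and `item_id` is a hypothetical stand-in for the elided "...other columns...".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE ColorLookup (
        color VARCHAR(20) PRIMARY KEY
    );
    CREATE TABLE ItemsWithColors (
        item_id INTEGER PRIMARY KEY,  -- hypothetical stand-in column
        color VARCHAR(20),
        FOREIGN KEY (color) REFERENCES ColorLookup(color)
            ON UPDATE CASCADE ON DELETE SET NULL
    );
    INSERT INTO ColorLookup VALUES ('red'), ('grey');
    INSERT INTO ItemsWithColors VALUES (1, 'red'), (2, 'grey');
""")

# Renaming a color in the lookup table cascades to the referencing rows.
conn.execute("UPDATE ColorLookup SET color = 'gray' WHERE color = 'grey'")
colors = [row[0] for row in conn.execute(
    "SELECT color FROM ItemsWithColors ORDER BY item_id")]

# A color that is not in the lookup table is rejected.
try:
    conn.execute("INSERT INTO ItemsWithColors VALUES (3, 'mauve')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting a lookup row sets the referencing column to NULL.
conn.execute("DELETE FROM ColorLookup WHERE color = 'red'")
after_delete = conn.execute(
    "SELECT color FROM ItemsWithColors WHERE item_id = 1").fetchone()[0]

print(colors, rejected, after_delete)
```

Note that the main table is queried for color names directly, with no join to the lookup table.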
It's surprising to me that so many other people on this thread seem to have mistaken ideas of what "normalization" is. Using surrogate keys (the ubiquitous "id") has nothing to do with normalization!
Re comment from @MacGruber:
Yes, the size is a factor. In InnoDB for example, every secondary index stores the primary key value of the row(s) where a given index value occurs. So the more secondary indexes you have, the greater the overhead for using a "bulky" data type for the primary key.
Also this affects foreign keys; the foreign key column must be the same data type as the primary key it references. You might have a small lookup table so you think the primary key size in a 50-row table doesn't matter. But that lookup table might be referenced by millions or billions of rows in other tables!
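A rough back-of-the-envelope sketch of that size argument, in Python. The row count and the 8-character average color name are my assumptions for illustration. The only fixed inputs are that a MySQL INT takes 4 bytes and a short VARCHAR carries a 1-byte length prefix. It ignores row overhead and the extra copies held in indexes, so treat the result as a lower bound.

```python
def fk_column_bytes(rows: int, bytes_per_value: int) -> int:
    """Raw bytes used by one foreign key column across all referencing rows."""
    return rows * bytes_per_value

ROWS = 1_000_000_000          # assumed number of referencing rows
INT_KEY_BYTES = 4             # 4-byte INT surrogate key
VARCHAR_KEY_BYTES = 8 + 1     # assumed 8-byte average value + 1 length byte

int_total = fk_column_bytes(ROWS, INT_KEY_BYTES)
varchar_total = fk_column_bytes(ROWS, VARCHAR_KEY_BYTES)
extra_gib = (varchar_total - int_total) / 2**30

print(f"INT: {int_total:,} bytes, VARCHAR: {varchar_total:,} bytes")
print(f"Extra for VARCHAR key: {extra_gib:.2f} GiB")
```

Under these assumptions the natural key costs a few extra GiB per referencing column, which may or may not matter for your case.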
There's no right answer for all cases. Any answer can be correct for different cases. You just learn about the tradeoffs, and try to make an informed decision on a case by case basis.
What's best practice for normalisation of DB where a domain table has an Other option for free text?
After talking it over, we're gonna do as suggested and do a text scan and use that to populate our lookups, then going forward we'll try to discourage the use of free-text fields for lookups, storing the values in a separate table for the time being so we don't clutter our main tables.
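For what it's worth, the "text scan" step could look something like the sketch below, again using SQLite from Python. The table and column names (`Responses`, `occupation_other`, `OccupationLookup`) are hypothetical, and trimming plus lowercasing is just one possible way to fold near-duplicates together; real data will likely need manual review as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table and a main table with a free-text "Other" column.
    CREATE TABLE OccupationLookup (name TEXT PRIMARY KEY);
    CREATE TABLE Responses (
        response_id INTEGER PRIMARY KEY,
        occupation_other TEXT
    );
    INSERT INTO OccupationLookup VALUES ('nurse');
    INSERT INTO Responses VALUES
        (1, ' Plumber'), (2, 'plumber'), (3, 'Nurse'), (4, NULL), (5, 'welder');
""")

# Scan the free-text column and add each distinct, normalized value
# to the lookup table; values already present are skipped.
conn.execute("""
    INSERT OR IGNORE INTO OccupationLookup (name)
    SELECT DISTINCT LOWER(TRIM(occupation_other))
    FROM Responses
    WHERE occupation_other IS NOT NULL
      AND TRIM(occupation_other) <> ''
""")

names = [row[0] for row in conn.execute(
    "SELECT name FROM OccupationLookup ORDER BY name")]
print(names)
```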
Search for specific values in data using a lookup table
This is my version, which is very similar but uses an OUTER APPLY instead of multiple joins:
SELECT DISTINCT d.id, aa.number, aa.letter
FROM #data d
OUTER APPLY (
  SELECT * FROM #expected_letters el
  WHERE el.number = d.number
    AND el.letter NOT IN
      (SELECT letter FROM #data dt WHERE dt.number = d.number AND dt.id = d.id)
) aa
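To make the behavior concrete, here is a small runnable sketch with made-up sample rows. SQLite has no OUTER APPLY, so this swaps in a LEFT JOIN with the same correlated NOT IN condition, which should give the same rows for this query. The table contents are my assumption, since the original data isn't shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Plain tables standing in for the #data and #expected_letters temp tables.
    CREATE TABLE data (id INTEGER, number INTEGER, letter TEXT);
    CREATE TABLE expected_letters (number INTEGER, letter TEXT);
    INSERT INTO expected_letters VALUES (10, 'A'), (10, 'B'), (10, 'C');
    INSERT INTO data VALUES
        (1, 10, 'A'), (1, 10, 'B'),                -- id 1 is missing 'C'
        (2, 10, 'A'), (2, 10, 'B'), (2, 10, 'C');  -- id 2 is complete
""")

# For each id, list the expected letters that do not appear in its data rows.
# Ids with nothing missing still come back, with NULLs, as with OUTER APPLY.
missing = conn.execute("""
    SELECT DISTINCT d.id, el.number, el.letter
    FROM data d
    LEFT JOIN expected_letters el
           ON el.number = d.number
          AND el.letter NOT IN (SELECT dt.letter FROM data dt
                                WHERE dt.number = d.number AND dt.id = d.id)
    ORDER BY d.id
""").fetchall()

print(missing)
```

If you only want the ids that are actually missing something, filter out the NULL rows (or use CROSS APPLY / an inner join instead).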
Database - Enumeration type - Extra table or just a column
There is no problem having a single table, although I would recommend a check constraint to validate the value:
check (status in ('ACTIVE', 'INACTIVE', 'BLOCKED'))
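As a quick illustration, the sketch below puts that constraint on a hypothetical `Accounts` table (the table and its `account_id` column are my invention) in an in-memory SQLite database and shows an invalid status being rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Accounts (
        account_id INTEGER PRIMARY KEY,   -- hypothetical column
        status VARCHAR(10) NOT NULL
            CHECK (status IN ('ACTIVE', 'INACTIVE', 'BLOCKED'))
    )
""")

# A listed status is accepted.
conn.execute("INSERT INTO Accounts VALUES (1, 'ACTIVE')")

# Anything else violates the check constraint.
try:
    conn.execute("INSERT INTO Accounts VALUES (2, 'DELETED')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Accounts").fetchone()[0]
print(rejected, row_count)
```

The catch is that adding a status later means altering the table definition, which is where a reference table starts to pay off.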
There are many situations when you want a reference table. It provides a lot of capabilities, such as:
- Easily able to add new statuses.
- Easily able to change the names.
- The ability to have short and long names.
- The ability to share the exact same statuses across different tables.
- The ability to know when a new status was added or changed.
- The ability to include priorities or ordering for the statuses.
However, it is not necessary to put all strings into a reference table.
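For comparison, a reference table covering several of the capabilities listed above might be sketched like this. All table and column names here (`StatusLookup`, `long_name`, `sort_order`, `created_at`, `Accounts`, `Devices`) are hypothetical, and SQLite is assumed only so the example runs as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce REFERENCES
conn.executescript("""
    CREATE TABLE StatusLookup (
        code       VARCHAR(10) PRIMARY KEY,                 -- short name
        long_name  VARCHAR(50) NOT NULL,                    -- long name
        sort_order INTEGER NOT NULL,                        -- display ordering
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP  -- when it was added
    );
    -- Two different tables share the exact same set of statuses.
    CREATE TABLE Accounts (
        account_id INTEGER PRIMARY KEY,
        status VARCHAR(10) NOT NULL REFERENCES StatusLookup(code)
    );
    CREATE TABLE Devices (
        device_id INTEGER PRIMARY KEY,
        status VARCHAR(10) NOT NULL REFERENCES StatusLookup(code)
    );
    INSERT INTO StatusLookup (code, long_name, sort_order) VALUES
        ('BLOCKED',  'Blocked by administrator', 3),
        ('ACTIVE',   'Active',                   1),
        ('INACTIVE', 'Inactive',                 2);
""")

# Adding a new status is a plain INSERT, with no schema change.
conn.execute("""
    INSERT INTO StatusLookup (code, long_name, sort_order)
    VALUES ('PENDING', 'Pending review', 0)
""")

ordered = [row[0] for row in conn.execute(
    "SELECT code FROM StatusLookup ORDER BY sort_order")]
print(ordered)
```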