Naming of Id Columns in Database Tables

Naming a key column just ID is a SQL antipattern.
See the book SQL Antipatterns: http://www.amazon.com/s/ref=nb_sb_ss_i_1_5?url=search-alias%3Dstripbooks&field-keywords=sql+antipatterns&sprefix=sql+a

If many tables use ID as the name of their key column, you make reporting that much more difficult. The bare name obscures meaning, makes complex queries harder to read, and forces you to alias the columns to tell them apart on the report itself.

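Further, if someone is foolish enough to use a natural join in a database that supports them, the join matches on every column name the two tables share. With ID in both tables, it pairs up unrelated ID values and you will join to the wrong records.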
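The natural join problem can be reproduced with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in for whatever database you run, and the customer and orders tables are hypothetical, made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: both use a bare "id" primary key.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    -- Order 1 belongs to Bob, order 2 belongs to Alice.
    INSERT INTO orders VALUES (1, 2, 10.0), (2, 1, 20.0);
""")

# NATURAL JOIN matches on the only shared column name, "id",
# so it pairs customer.id with orders.id instead of orders.customer_id.
natural_rows = conn.execute(
    "SELECT name, total FROM customer NATURAL JOIN orders ORDER BY name"
).fetchall()

# The intended join, written out explicitly.
explicit_rows = conn.execute(
    "SELECT c.name, o.total FROM customer c "
    "JOIN orders o ON o.customer_id = c.id ORDER BY c.name"
).fetchall()

print(natural_rows)   # each customer paired with the wrong order
print(explicit_rows)  # each customer paired with their own order
```

Both queries run without error, which is the danger: nothing flags the natural join as wrong.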

If you would like to use the USING syntax that some databases allow, you cannot if the key is named ID, because USING requires the join column to have the same name in both tables.
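A minimal sketch of the USING limitation, again assuming SQLite via Python's sqlite3 module. The table names (customer_a, orders_a, customer_b, orders_b) are hypothetical; the _a pair uses a bare id key and the _b pair uses customer_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Style A: primary key named "id" (hypothetical tables).
    CREATE TABLE customer_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders_a (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    -- Style B: primary key named after the table.
    CREATE TABLE customer_b (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders_b (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer_a VALUES (1, 'Alice');
    INSERT INTO orders_a VALUES (7, 1, 20.0);
    INSERT INTO customer_b VALUES (1, 'Alice');
    INSERT INTO orders_b VALUES (7, 1, 20.0);
""")

# Style A: customer_a has no customer_id column, so USING cannot be applied.
try:
    conn.execute(
        "SELECT name, total FROM customer_a JOIN orders_a USING (customer_id)"
    )
    using_error = None
except sqlite3.OperationalError as exc:
    using_error = str(exc)

# Style B: the key has the same name on both sides, so USING works.
using_rows = conn.execute(
    "SELECT name, total FROM customer_b JOIN orders_b USING (customer_id)"
).fetchall()

print(using_error)
print(using_rows)
```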

If you use ID, you can easily end up with a mistaken join when you copy the join syntax (don't tell me that no one ever does this!) and forget to change the alias in the join condition.

So you now have

select t1.field1, t2.field2, t3.field3
from table1 t1
join table2 t2 on t1.id = t2.table1id
join table3 t3 on t1.id = t3.table2id

when you meant

select t1.field1, t2.field2, t3.field3 
from table1 t1
join table2 t2 on t1.id = t2.table1id
join table3 t3 on t2.id = t3.table2id

If you use tablenameID as the id field, this kind of accidental mistake is far less likely to happen and much easier to find.
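To see why the mistake is easier to find, here is a rough sketch using SQLite through Python's sqlite3 module (an assumption; the original names no particular database). The row values are made up. With bare id columns the copy-paste mistake runs and quietly returns the wrong result; with tablenameID columns the same slip refers to a column that does not exist, so the query fails outright.

```python
import sqlite3

# Schema with bare "id" keys, as in the queries above.
plain = sqlite3.connect(":memory:")
plain.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, field1 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1id INTEGER, field2 TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, table2id INTEGER, field3 TEXT);
    INSERT INTO table1 VALUES (1, 'a');
    INSERT INTO table2 VALUES (5, 1, 'b');
    INSERT INTO table3 VALUES (9, 5, 'c');
""")
# The copy-paste mistake: t1.id where t2.id was meant. It runs without error.
silent_rows = plain.execute("""
    SELECT t1.field1, t2.field2, t3.field3
    FROM table1 t1
    JOIN table2 t2 ON t1.id = t2.table1id
    JOIN table3 t3 ON t1.id = t3.table2id
""").fetchall()

# Same schema with tablenameID keys.
prefixed = sqlite3.connect(":memory:")
prefixed.executescript("""
    CREATE TABLE table1 (table1id INTEGER PRIMARY KEY, field1 TEXT);
    CREATE TABLE table2 (table2id INTEGER PRIMARY KEY, table1id INTEGER, field2 TEXT);
    CREATE TABLE table3 (table3id INTEGER PRIMARY KEY, table2id INTEGER, field3 TEXT);
""")
# The same slip (wrong alias) now names a column table1 does not have.
try:
    prefixed.execute("""
        SELECT t1.field1, t2.field2, t3.field3
        FROM table1 t1
        JOIN table2 t2 ON t1.table1id = t2.table1id
        JOIN table3 t3 ON t1.table2id = t3.table2id
    """)
    loud_error = None
except sqlite3.OperationalError as exc:
    loud_error = str(exc)

print(silent_rows)  # no rows, and no error to hint at the bug
print(loud_error)   # the database rejects the mistaken join
```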

Naming primary keys id vs something_id in SQL

Whatever you do, pick one or the other and stick to that standard. There are pros and cons for each.

I prefer SomethingID but other people prefer just ID. In the system I work with there are well over a thousand tables and having the PK and the FK have the exact same names makes things easier.

Naming convention for an identity column in a database

One advantage of longer names: when you use columns in a complicated query with many tables (e.g. joins), you don't have to prefix columns to know which table they come from, and you also minimize problems with ambiguous column names.
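The ambiguity point can be sketched as follows, assuming SQLite via Python's sqlite3 module. The car and document tables and their columns are hypothetical, chosen only for illustration.

```python
import sqlite3

# Short names: both tables have "id" and "name".
short = sqlite3.connect(":memory:")
short.executescript("""
    CREATE TABLE car (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE document (id INTEGER PRIMARY KEY, name TEXT, car_id INTEGER);
    INSERT INTO car VALUES (1, 'Beetle');
    INSERT INTO document VALUES (1, 'Title deed', 1);
""")
try:
    # Unqualified "name" could mean car.name or document.name.
    short.execute("SELECT name FROM car JOIN document ON document.car_id = car.id")
    ambiguity_error = None
except sqlite3.OperationalError as exc:
    ambiguity_error = str(exc)

# Longer names: each column name says which table it comes from.
long_names = sqlite3.connect(":memory:")
long_names.executescript("""
    CREATE TABLE car (car_id INTEGER PRIMARY KEY, car_name TEXT);
    CREATE TABLE document (document_id INTEGER PRIMARY KEY, document_name TEXT, car_id INTEGER);
    INSERT INTO car VALUES (1, 'Beetle');
    INSERT INTO document VALUES (1, 'Title deed', 1);
""")
unprefixed_rows = long_names.execute(
    "SELECT car_name, document_name "
    "FROM car JOIN document ON document.car_id = car.car_id"
).fetchall()

print(ambiguity_error)
print(unprefixed_rows)
```

Note that the shared car_id column would still need a table prefix if it appeared in the select list; only the join condition qualifies it here.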

Is it more common to use table_id or id in database design

There are two main schools of thought on naming columns in tables:

Schema Namespace

This is the traditional strategy, conceived by teams documenting the "data dictionary" of a database in the 1970s. The idea is that the column name itself tells you which table it belongs to, across the whole schema or database. For example, CLIENT_NAME would represent the name of the client in the CLIENT table.

There are variations of this strategy where a limited number of letters are assigned as prefixes (especially for M:N relationship tables), because at the time column names were limited to 6 or 8 characters in many databases. For example, the date of purchase of a car by a client could take the form CLI_CAR_DATE, CLICAR_DATE, or even CLCADT.

Examples:

  • A primary key "id" column of the entity table "car" would be named CAR_ID.
  • A foreign key on a child table "document" that points to "car" would take the same form: CAR_ID. This allows the use of natural joins; however, there are compelling reasons to avoid natural joins at all costs, which are not discussed here.
  • Foreign keys on a table "transfer" that has multiple (two) relationships (seller and buyer) with "person" pollute this strategy. They could be named PERSON_BUYER_ID and PERSON_SELLER_ID, because both cannot have the same name PERSON_ID; this no longer allows natural joins (good).
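The three examples above can be put together in a small sketch. It assumes SQLite via Python's sqlite3 module; the key names follow the examples, while PERSON_NAME, CAR_MODEL, and DOCUMENT_TITLE, and the sample rows, are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Schema namespace: every column name carries its table name.
    CREATE TABLE PERSON (PERSON_ID INTEGER PRIMARY KEY, PERSON_NAME TEXT);
    CREATE TABLE CAR (CAR_ID INTEGER PRIMARY KEY, CAR_MODEL TEXT);
    CREATE TABLE DOCUMENT (
        DOCUMENT_ID INTEGER PRIMARY KEY,
        CAR_ID INTEGER REFERENCES CAR (CAR_ID),   -- FK keeps the PK's name
        DOCUMENT_TITLE TEXT
    );
    CREATE TABLE TRANSFER (
        TRANSFER_ID INTEGER PRIMARY KEY,
        CAR_ID INTEGER REFERENCES CAR (CAR_ID),
        -- Two FKs to PERSON cannot both be called PERSON_ID.
        PERSON_BUYER_ID INTEGER REFERENCES PERSON (PERSON_ID),
        PERSON_SELLER_ID INTEGER REFERENCES PERSON (PERSON_ID)
    );
    INSERT INTO PERSON VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO CAR VALUES (1, 'Beetle');
    INSERT INTO TRANSFER VALUES (1, 1, 1, 2);
""")

# PERSON is joined twice, so its columns need aliases even in this strategy.
schema_ns_rows = conn.execute("""
    SELECT CAR_MODEL, b.PERSON_NAME, s.PERSON_NAME
    FROM TRANSFER
    JOIN CAR ON CAR.CAR_ID = TRANSFER.CAR_ID
    JOIN PERSON b ON b.PERSON_ID = PERSON_BUYER_ID
    JOIN PERSON s ON s.PERSON_ID = PERSON_SELLER_ID
""").fetchall()

print(schema_ns_rows)
```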

Table Namespace

In this newer strategy, column names do not include the name of the entity they belong to, only their property name. This strategy aligns more with object design and produces shorter names (i.e. less typing). The name of the table must be given when mentioning a column. For example, you would need to say the column NAME on the table CLIENT.

Examples:

  • A primary key "id" column of the entity table "car" would be named ID.
  • A foreign key on a child table "document" that points to "car" would take the form: CAR_ID; this is the same solution as the previous strategy.
  • Foreign keys on a table "transfer" that has multiple (two) relationships (seller and buyer) with "person" could be named BUYER_ID and SELLER_ID. They could use the longer names from the previous strategy, but the goal here is typically to have shorter names so the app source code gets easier to write and to debug.
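A comparable sketch of the table-namespace examples, assuming SQLite via Python's sqlite3 module. The NAME and MODEL columns and the sample rows are hypothetical; the key names follow the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table namespace: columns carry only their property name.
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE car (id INTEGER PRIMARY KEY, model TEXT);
    CREATE TABLE transfer (
        id INTEGER PRIMARY KEY,
        car_id INTEGER REFERENCES car (id),
        buyer_id INTEGER REFERENCES person (id),   -- role name, no table prefix
        seller_id INTEGER REFERENCES person (id)
    );
    INSERT INTO person VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO car VALUES (1, 'Beetle');
    INSERT INTO transfer VALUES (1, 1, 1, 2);
""")

# Every column is qualified with its table alias, because "id" and "name"
# mean nothing without the table they belong to.
table_ns_rows = conn.execute("""
    SELECT c.model, b.name, s.name
    FROM transfer t
    JOIN car c ON c.id = t.car_id
    JOIN person b ON b.id = t.buyer_id
    JOIN person s ON s.id = t.seller_id
""").fetchall()

print(table_ns_rows)
```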

Summary

I personally like the second one, but teams adhere to both strategies and there's no clear winner. I lean towards the second one because I think the first one suffers from longer names (more typing), longer SQL (more errors), cryptic names (they don't play well with ORMs and app objects), and foreign keys that cannot follow the strategy well. In fact, virtually all the primary keys in my databases are named ID regardless of the specific entities.

On the flip side, some teams place a high value on knowing the table a column belongs to just by looking at its name. This is great for big databases (with 200-1000 relational fact tables) that can become quite complex, especially for new members of a team.

But above all, pick one and be consistent.

Database, Table and Column Naming Conventions?

I recommend checking out Microsoft's SQL Server sample databases:
https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks

The AdventureWorks sample uses a very clear and consistent naming convention that uses schema names for the organization of database objects.

  1. Singular names for tables
  2. Singular names for columns
  3. Schema name as a prefix for tables (e.g. SchemaName.TableName)
  4. Pascal casing (a.k.a. upper camel case)

Naming the MYSQL id column

I usually name them as [tblname]_id. The reason is simple. Let's say I have two tables:

member
    member_id
    member_alias
    ...

post
    post_id
    member_id
    post_text
    ...

Now I can join them with MySQL's USING syntax and save a few characters for each join:

SELECT post.*, member_id, member_alias FROM post
INNER JOIN member USING (member_id)

Of course, this is all mostly subjective.

The right way to write ID in columns in SQL

Personally, I'd use ProductID, ProductName, etc. in a Product table, and the same names for the FKs too, to avoid having ID and Name columns everywhere.

Just be consistent.

Id or [TableName]Id as primary key / entity identifier

TableNameID for clarity

  1. Improve JOIN readability
  2. Clarity, say, when a table has multiple FK "ID" columns (each FK name matches its PK)
  3. ID is a reserved keyword

What does it mean when there is no Id at the end of a column name which appears to be a foreign key?

There is no required naming convention for columns in SQL Server that differentiates between a data column, a primary key column or a foreign key column.

The only constraints on column names are that they follow the rules for SQL Server identifier naming. However, in a particular work environment you might well use a naming convention that puts ID at the end of the column name to make the intention of the column obvious.

To create a self-referencing foreign key, you do the same as for any other foreign key, either as part of the CREATE TABLE or in an ALTER TABLE.

CREATE TABLE pokemon (
pokemonId INT IDENTITY(1, 1),
...
CONSTRAINT fk_pokemon_evolvesFrom FOREIGN KEY (evolvesFrom) REFERENCES pokemon (pokemonId)
);

-- OR

ALTER TABLE pokemon
ADD CONSTRAINT fk_pokemon_evolvesFrom FOREIGN KEY (evolvesFrom)
REFERENCES pokemon (pokemonId)
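For a version you can run end to end, here is a minimal sketch that swaps SQL Server for SQLite (via Python's sqlite3 module), so IDENTITY(1, 1) becomes INTEGER PRIMARY KEY and foreign key enforcement has to be switched on with a PRAGMA. The name column and the sample rows are hypothetical; evolvesFrom is declared as a nullable integer, which is an assumption since the original elides the column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.execute("""
    CREATE TABLE pokemon (
        pokemonId INTEGER PRIMARY KEY,
        name TEXT NOT NULL,          -- hypothetical column for the demo
        evolvesFrom INTEGER,         -- assumed nullable: base forms have no parent
        CONSTRAINT fk_pokemon_evolvesFrom
            FOREIGN KEY (evolvesFrom) REFERENCES pokemon (pokemonId)
    )
""")

conn.execute("INSERT INTO pokemon VALUES (1, 'Bulbasaur', NULL)")
conn.execute("INSERT INTO pokemon VALUES (2, 'Ivysaur', 1)")

# Self-join: each evolved form next to the form it evolves from.
evolutions = conn.execute("""
    SELECT child.name, parent.name
    FROM pokemon child
    JOIN pokemon parent ON child.evolvesFrom = parent.pokemonId
""").fetchall()

# A row pointing at a pokemonId that does not exist is rejected.
try:
    conn.execute("INSERT INTO pokemon VALUES (3, 'Orphan', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(evolutions)
print(fk_rejected)
```

The column is called evolvesFrom with no Id suffix, and the constraint works all the same: the database cares about the declared relationship, not the name.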

SQL - naming of ID columns

It's all personal preference. I personally use Id simply because I think of each table as its own entity. Then, when I reference it with a key, it becomes CustomerId or OrderId depending on the name of the table.


