SQL Primary Key - Is It Necessary

Is a Primary Key necessary in SQL Server?

Necessary? No. Used behind the scenes? Well, it's saved to disk and kept in the row cache, etc. Removing it will slightly improve performance, though you'd need a stopwatch with millisecond precision to notice.

But the next time someone needs to create references to this table, they will curse you. If they are brave, they will add a PK (and wait a long time for the DB to create the column). If they are not brave, or are careless, they will start creating references using the business key (i.e. the data columns), which will cause a maintenance nightmare.

Conclusion: Since the cost of having a PK (even if it's not used at the moment) is so small, leave it in place.
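To make the "maintenance nightmare" concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The `customer` and `invoice` tables and their columns are hypothetical names chosen for illustration. The invoice references the customer through the surrogate `id`, so a change to the business value (the email) needs no change to the referencing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on

# Hypothetical schema: a surrogate PK plus a business key (email).
conn.execute("""
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    )
""")
conn.execute("""
    CREATE TABLE invoice (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        total       REAL NOT NULL
    )
""")

conn.execute("INSERT INTO customer (id, email) VALUES (1, 'ann@example.com')")
conn.execute("INSERT INTO invoice (customer_id, total) VALUES (1, 42.0)")

# The business value changes; the reference by id stays valid as it is.
conn.execute("UPDATE customer SET email = 'ann@example.org' WHERE id = 1")

row = conn.execute("""
    SELECT c.email, i.total
    FROM invoice AS i
    JOIN customer AS c ON c.id = i.customer_id
""").fetchone()
print(row)
```

Had `invoice` stored the email instead of the id, the same update would have required touching every referencing row as well.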

SQL Primary Key - is it necessary?

Always aim to have a primary key.

If you are unsure, have a primary key.

Even if you are 99.99% sure you will not need it, have one. Requirements change, as I have learned through many years of experience.

The only exceptions I can really think of are many-to-many tables with just two foreign keys, and mega-huge tables (hundreds of millions of rows) where every byte counts. But even then a separate, unique id key with no business meaning is still strongly recommended.

There's some more great info on this here:

http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx

and here:

http://www.techrepublic.com/article/the-great-primary-key-debate/1045050

here:

http://databases.aspfaq.com/database/what-should-i-choose-for-my-primary-key.html

and here:

Should I use composite primary keys or not?

In your example, I would definitely have one.

The decision not to have one should be based on a very clear need and understanding, and on actual or predicted issues (e.g. volume) with having one.

One great example of this need comes up when debugging and troubleshooting. Just like having created and updated timestamp columns in each table (another favorite of mine), this info may not initially be used by or for the front end, but boy can it be helpful in tracing and resolving issues. (By the way, update stamps are now often standard in frameworks like Ruby on Rails, which also works well with the convention of every table having an id field!)

Is a primary key necessary?

This is a subjective question, so I hope you don't mind me answering with some opinion :)

In the vast majority of tables I've made (I'm talking 95%+) I've added a primary key, and been glad I did. This is either the most critical unique field in my table (think "social security number") or, more often than not, just an auto-incrementing number that allows me to quickly and easily refer to a row when querying.

This latter use is the most common, and it even has its own name: a "surrogate" or "synthetic" key. This is a value auto-generated by the database and not derived from your application data. If you want to add relations between your tables, this surrogate key is immediately helpful as a foreign key. As someone else answered, these keys are so common that MySQL likes to add one even if you don't, so I'd suggest that means the consensus is very heavily biased towards adding primary keys.
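As a rough illustration of a surrogate key, the sketch below uses Python's built-in sqlite3 module, where an `INTEGER PRIMARY KEY` column is filled in by the database on insert. Other systems spell this differently (for example `IDENTITY` in SQL Server or `AUTO_INCREMENT` in MySQL). The `note` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The id column is never supplied by the application; SQLite generates it.
conn.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

first = conn.execute("INSERT INTO note (body) VALUES ('buy milk')").lastrowid
second = conn.execute("INSERT INTO note (body) VALUES ('call Bob')").lastrowid

# The generated value is a convenient handle for finding the row again.
body = conn.execute("SELECT body FROM note WHERE id = ?", (second,)).fetchone()[0]
print(first, second, body)
```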

One other thing I like about primary keys is that they help convey your intent to others reading your table schemata and also to your DBMS: "this bit is how I intend to identify my rows uniquely, don't let me try to break that rule!"

To answer your question specifically: no, a primary key is not necessary. But realistically if you intend to store data in the table for any period of time beyond a few minutes, I would very strongly recommend you add one.

Why do we need a primary key?

"I suppose a primary key can have a not null value only if the column is declared as not null. But again this is not a feature of primary key."

A primary key can't contain NULL values. By definition, a primary key is UNIQUE and NOT NULL.

"My other question is why we have the concept of a primary key at all, because the only difference I find between a primary key and a unique key is that a primary key can be declared only on one column whereas a unique key can be declared on multiple columns."

This is completely wrong. A primary key can also span multiple columns. The difference between a primary key and a unique key is that a primary key cannot contain NULLs, whereas a unique key can.

The main purpose of a primary key is to identify each row uniquely, whereas a unique key exists to prevent duplicates. The main differences between the two are as follows.

Primary Key :

  1. There can only be one primary key for a table.
  2. The primary key consists of one or more columns.
  3. The primary key enforces the entity integrity of the table.
  4. All columns in the key must be defined as NOT NULL.
  5. The primary key uniquely identifies a row.
  6. In SQL Server, primary keys result in CLUSTERED unique indexes by default.

Unique Key :

  1. There can be multiple unique keys defined on a table.
  2. In SQL Server, unique keys result in NONCLUSTERED unique indexes by default.
  3. One or more columns make up a unique key.
  4. Columns may be NULL, but SQL Server allows only one NULL per unique column; many other databases allow several.
  5. A unique constraint can be referenced by a foreign key constraint.
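The NULL handling in the two lists can be tried out with a small sketch. It uses Python's built-in sqlite3 module, so the behavior shown is SQLite's, not SQL Server's: SQLite accepts several NULLs in a UNIQUE column, whereas SQL Server allows only one. The `account` table is hypothetical, and `WITHOUT ROWID` is used because SQLite only enforces NOT NULL on primary key columns of ordinary tables when it is declared explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table. WITHOUT ROWID makes SQLite enforce NOT NULL on the
# primary key; ordinary SQLite tables are laxer for legacy reasons.
conn.execute("""
    CREATE TABLE account (
        username TEXT PRIMARY KEY,
        phone    TEXT UNIQUE
    ) WITHOUT ROWID
""")

def rejected(sql):
    """Run one statement; return True if a constraint rejects it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# SQLite accepts several NULLs in a UNIQUE column (SQL Server would not).
conn.execute("INSERT INTO account VALUES ('ann', NULL)")
conn.execute("INSERT INTO account VALUES ('bob', NULL)")
conn.execute("INSERT INTO account VALUES ('cat', '555-0100')")

null_pk_rejected = rejected("INSERT INTO account VALUES (NULL, '555-0101')")
dup_pk_rejected = rejected("INSERT INTO account VALUES ('ann', '555-0102')")
dup_phone_rejected = rejected("INSERT INTO account VALUES ('dan', '555-0100')")

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(null_pk_rejected, dup_pk_rejected, dup_phone_rejected, row_count)
```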

I suggest you read up on primary keys and unique keys.

Why is it a bad idea to have a table without a primary key?

This response is mainly opinion/experience-based, so I'll list a few reasons that come to mind. Note that this is not exhaustive.

Here're some reasons why you should use primary keys (PKs):

  1. They allow you to have a way to uniquely identify a given row in a table to ensure that there're no duplicates.
  2. The RDBMS enforces this constraint for you, so you don't have to write additional code to check for duplicates before inserting, avoiding a full table scan, which implies better performance here.
  3. PKs allow you to create foreign keys (FKs) to create relations between tables in a way that the RDBMS is "aware" of them. Without PKs/FKs, the relationship only exists inside the programmer's mind, and the referenced table might have a row with its "PK" deleted, and the other table with the "FK" still thinks the "PK" exists. This is bad, which leads to the next point.
  4. It allows the RDBMS to enforce integrity constraints. Is TableA.id referenced by TableB.table_a_id? If TableB.table_a_id = 5 then, you're guaranteed to have a row with id = 5 in TableA. Data integrity and consistency is maintained, and that is good.
  5. It allows the RDBMS to perform faster searches because PK fields are indexed, so the engine can look a row up through the index (typically a B-tree) instead of checking every row in the table.
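Points 3 and 4 can be sketched with Python's built-in sqlite3 module. The table names follow the `TableA`/`TableB` example above, written here as `table_a` and `table_b`. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; most other engines enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite for FK checks

conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE table_b (
        id         INTEGER PRIMARY KEY,
        table_a_id INTEGER NOT NULL REFERENCES table_a(id)
    )
""")
conn.execute("INSERT INTO table_a (id) VALUES (5)")
conn.execute("INSERT INTO table_b (table_a_id) VALUES (5)")

def fails(sql):
    """Run one statement; return True if the database rejects it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# There is no row 6 in table_a, so this reference is refused.
orphan_rejected = fails("INSERT INTO table_b (table_a_id) VALUES (6)")

# Row 5 is still referenced from table_b, so it cannot be deleted.
delete_rejected = fails("DELETE FROM table_a WHERE id = 5")

print(orphan_rejected, delete_rejected)
```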

In my opinion, not having a PK might be legal (i.e. the RDBMS will let you), but it's not moral (i.e. you shouldn't do it). I think you'd need to have extraordinarily good/powerful reasons to argue for not using a PK in your DB tables (and I'd still find them debatable), but based on your current level of experience (i.e. you say you're "new to data modeling"), I'd say it's not yet enough to attempt justifying a lack of PKs.

There're more reasons, but I hope this gives you enough to work through it.

As far as your M:M relations go, you need to create a new table, called an associative table, and a composite PK in it, that PK being a combination of the 2 PKs of the other 2 tables.

In other words, if there's an M:M relation between tables A and B, then we create a table C to which both A and B have a 1:M relation. "Graphically", it'd look similar to:

+---+ 1  M +---+ M  1 +---+
| A |------| C |------| B |
+---+      +---+      +---+

With the C table PK somewhat like this:

+-----+
|  C  |
+-----+
| id  | <-- C.id = A.id + B.id (i.e. combined/concatenated, not addition!)
+-----+

or like this:

+-------+
|   C   |
+-------+
| a_id  | <--|
+-------+    +-- composite PK columns instead
| b_id  | <--|   of concatenation (recommended)
+-------+
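The recommended composite-key variant might look like the following sketch, again using Python's built-in sqlite3 module. The tables `a`, `b` and `c` mirror the diagram, and the inserted ids are made-up sample values. The composite primary key ensures that each (a_id, b_id) link is stored at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite for FK checks

conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY);

    -- Associative table: the PK is the pair of foreign keys.
    CREATE TABLE c (
        a_id INTEGER NOT NULL REFERENCES a(id),
        b_id INTEGER NOT NULL REFERENCES b(id),
        PRIMARY KEY (a_id, b_id)
    );

    INSERT INTO a (id) VALUES (1), (2);
    INSERT INTO b (id) VALUES (10), (20);
    INSERT INTO c (a_id, b_id) VALUES (1, 10), (1, 20), (2, 10);
""")

# Repeating an existing link violates the composite primary key.
try:
    conn.execute("INSERT INTO c (a_id, b_id) VALUES (1, 10)")
    duplicate_link_rejected = False
except sqlite3.IntegrityError:
    duplicate_link_rejected = True

links = conn.execute("SELECT a_id, b_id FROM c ORDER BY a_id, b_id").fetchall()
print(duplicate_link_rejected, links)
```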

Should each and every table have a primary key?

Short answer: yes.

Long answer:

  • You need your table to be joinable on something
  • If you want your table to be clustered, you need some kind of a primary key.
  • If your table design does not need a primary key, rethink your design: most probably, you are missing something. Why keep identical records?

In MySQL, the InnoDB storage engine always clusters a table on some key: if you don't declare a primary key and there is no unique index on NOT NULL columns to use instead, it generates a hidden row ID column that you don't have access to.

Note that a primary key can be composite.

If you have a many-to-many link table, you create the primary key on all fields involved in the link. Thus you ensure that you don't have two or more records describing one link.

Besides the logical consistency issues, most RDBMS engines will benefit from including these fields in a unique index.

And since any primary key involves creating a unique index, you should declare it and get both logical consistency and performance.
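One way to see the performance side is to ask the database how it plans to run a query. The sketch below uses Python's built-in sqlite3 module and SQLite's `EXPLAIN QUERY PLAN`; the `item` table and its contents are hypothetical, and the exact wording of the plan text varies between SQLite versions. A lookup by primary key is reported as a search through the key, while a lookup on an unindexed column is reported as a scan of the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO item (id, label) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1000)],
)

def plan(sql):
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

by_key = plan("SELECT label FROM item WHERE id = 500")
by_label = plan("SELECT id FROM item WHERE label = 'item-500'")

print("by primary key:", by_key)
print("by plain column:", by_label)
```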

See this article in my blog for why you should always create a unique index on unique data:

  • Making an index UNIQUE

P.S. There are some very, very special cases where you don't need a primary key.

Mostly they include log tables which don't have any indexes for performance reasons.

In SQL, why do we need a primary key if we can use NOT NULL and UNIQUE constraints in place of a primary key?

The definition of a primary key is:

  • A primary key is unique.
  • A primary key is not null.
  • Table has only one primary key.

You are asking about the third condition. Well, that is the definition. The "primary key" is a single set of columns that has been explicitly chosen to uniquely identify each row in the table. The word "primary" implies that there is only one per table. Other columns or combinations of columns that meet the first two conditions are called candidate keys.

Although not strictly enforced, primary keys are the best method for referencing individual rows. They should be used for foreign key constraints, for instance (and any database that I come into contact with does enforce primary keys for foreign key constraints). Having multiple different keys refer to a single table confuses the data model. Think about Entity-Relationship modeling. The links should be primary keys.

To give a flavor of the use of primary keys, some databases (such as MySQL using the InnoDB storage engine) by default cluster tables based on the primary key. A table can only be clustered once, hence the use of a single key.


