Is a Primary Key Necessary in SQL Server

Is a Primary Key necessary in SQL Server?

Necessary? No. Used behind the scenes? Well, it's saved to disk and kept in the row cache, etc. Removing it will slightly increase your performance (you'd need a watch with millisecond precision to notice).

But ... the next time someone needs to create references to this table, they will curse you. If they are brave, they will add a PK (and wait a long time for the DB to create the column). If they are not brave, or just dumb, they will start creating references using the business key (i.e. the data columns), which will cause a maintenance nightmare.

Conclusion: since the cost of having a PK (even if it's not used at the moment) is so small, let it be.
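
For what it's worth, here is a rough sketch of what "adding a PK later" looks like in T-SQL. The table dbo.AuditEntry and its columns are made up for illustration; they are not from the question. On a big table the ALTER is the slow part, since every existing row has to be given a value.

```sql
-- Hypothetical table that was originally created without a primary key.
CREATE TABLE dbo.AuditEntry (
    EntryText NVARCHAR(400) NOT NULL,
    LoggedAt  DATETIME2     NOT NULL
);

-- Later on: add a surrogate key column and declare it as the primary key
-- in one statement. SQL Server fills in the IDENTITY values for existing rows.
ALTER TABLE dbo.AuditEntry
    ADD AuditEntryID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_AuditEntry PRIMARY KEY;
```

Other tables can now reference dbo.AuditEntry (AuditEntryID) instead of the data columns.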

SQL Primary Key - is it necessary?

Always aim to have a primary key.

If you are unsure, have a primary key.

Even if you are 99.99% sure you will not need it, have one. Requirements change, as I have learned through experience over many years.

The only exceptions I can really think of are many-to-many tables with just two foreign keys and mega-huge (hundreds of millions of rows) tables where every byte counts. But even then a separate, unique id key with no business value is still strongly recommended.

There's some more great info on this here:

http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx

and here:

http://www.techrepublic.com/article/the-great-primary-key-debate/1045050

here:

http://databases.aspfaq.com/database/what-should-i-choose-for-my-primary-key.html

and here:

Should I use composite primary keys or not?

In your example, I would definitely have one.

The decision not to have one should be based on a very clear need and on actual or predicted issues (e.g. volume) with having one.

One great example of why you want a key comes up when debugging and troubleshooting. Just like having created/updated timestamp columns in each table (another favorite of mine), this info may not initially be used by the front end, but boy can it be helpful in tracing and resolving issues. (By the way, update stamps are now often standard in frameworks like Ruby on Rails, which also works well with the convention of every table having an id field!)

Why we need a primary key?

I suppose a primary key can have a not null value only if the column
is declared as not null. But again this is not a feature of primary
key.

A primary key can't have NULL values. By definition, a primary key is UNIQUE and NOT NULL.

My another question is that why do we have a concept of primary key
because I find only one difference between primary key and unique key
is that "Primary key can be declared only on one column whereas unique
key can be declared on multiple columns"

This is completely wrong. You can create a primary key on multiple columns as well. The difference between a primary key and a unique key is that primary key columns cannot be NULL, whereas a unique key can contain NULL values.

The main purpose of a primary key is to identify a row uniquely, whereas a unique key is there to prevent duplicates. The following are the main differences between a primary key and a unique key.

Primary Key :

  1. There can only be one primary key for a table.
  2. The primary key consists of one or more columns.
  3. The primary key enforces the entity integrity of the table.
  4. All columns in the key must be defined as NOT NULL.
  5. The primary key uniquely identifies a row.
  6. Primary keys result in CLUSTERED unique indexes by default.

Unique Key :

  1. There can be multiple unique keys defined on a table.
  2. Unique Keys result in NONCLUSTERED Unique Indexes by default.
  3. One or more columns make up a unique key.
  4. Column may be NULL, but only one NULL per column is allowed.
  5. A unique constraint can be referenced by a Foreign Key Constraint.
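
To make the NULL difference concrete, here is a small sketch for SQL Server. The table dbo.Person and its columns are hypothetical. Note that the "only one NULL" behaviour is how SQL Server treats UNIQUE constraints; some other database systems allow several NULLs in a unique column.

```sql
-- Hypothetical table: PersonID is the primary key, Email is a nullable unique key.
CREATE TABLE dbo.Person (
    PersonID INT          NOT NULL CONSTRAINT PK_Person PRIMARY KEY,
    Email    VARCHAR(255) NULL     CONSTRAINT UQ_Person_Email UNIQUE
);

-- Accepted: the first NULL in the unique column is fine.
INSERT INTO dbo.Person (PersonID, Email) VALUES (1, NULL);

-- Rejected by SQL Server: a second NULL counts as a duplicate for UNIQUE.
INSERT INTO dbo.Person (PersonID, Email) VALUES (2, NULL);

-- Rejected: a primary key column can never be NULL.
INSERT INTO dbo.Person (PersonID, Email) VALUES (NULL, 'someone@example.com');
```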

I suggest you read this primary key and unique key

Should each and every table have a primary key?

Short answer: yes.

Long answer:

  • You need your table to be joinable on something
  • If you want your table to be clustered, you need some kind of a primary key.
  • If your table design does not need a primary key, rethink your design: most probably, you are missing something. Why keep identical records?

In MySQL, the InnoDB storage engine always needs a clustered index: if you don't define a primary key (and there is no suitable unique NOT NULL index to use instead), it generates a hidden row ID column that you don't have access to.

Note that a primary key can be composite.

If you have a many-to-many link table, you create the primary key on all fields involved in the link. Thus you ensure that you don't have two or more records describing one link.

Besides the logical consistency issues, most RDBMS engines will benefit from including these fields in a unique index.

And since any primary key involves creating a unique index, you should declare it and get both logical consistency and performance.

See this article in my blog for why you should always create a unique index on unique data:

  • Making an index UNIQUE

P.S. There are some very, very special cases where you don't need a primary key.

Mostly they include log tables which don't have any indexes for performance reasons.

Why is it a bad idea to have a table without a primary key?

This response is mainly opinion/experience-based, so I'll list a few reasons that come to mind. Note that this is not exhaustive.

Here're some reasons why you should use primary keys (PKs):

  1. They allow you to have a way to uniquely identify a given row in a table to ensure that there're no duplicates.
  2. The RDBMS enforces this constraint for you, so you don't have to write additional code to check for duplicates before inserting, avoiding a full table scan, which implies better performance here.
  3. PKs allow you to create foreign keys (FKs) to create relations between tables in a way that the RDBMS is "aware" of them. Without PKs/FKs, the relationship only exists inside the programmer's mind, and the referenced table might have a row with its "PK" deleted, and the other table with the "FK" still thinks the "PK" exists. This is bad, which leads to the next point.
  4. It allows the RDBMS to enforce integrity constraints. Is TableA.id referenced by TableB.table_a_id? If TableB.table_a_id = 5, then you're guaranteed to have a row with id = 5 in TableA. Data integrity and consistency are maintained, and that is good.
  5. It allows the RDBMS to perform faster searches because PK fields are indexed, which means that a table doesn't need to have all of its rows checked when searching for something (e.g. a binary search on a tree structure).
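
As a rough illustration of points 3 and 4, here is a T-SQL sketch using the TableA.id / TableB.table_a_id names from above. The column types and constraint names are my own assumptions.

```sql
-- Parent table with a primary key.
CREATE TABLE dbo.TableA (
    id INT NOT NULL CONSTRAINT PK_TableA PRIMARY KEY
);

-- Child table whose table_a_id must point at an existing TableA row.
CREATE TABLE dbo.TableB (
    id         INT NOT NULL CONSTRAINT PK_TableB PRIMARY KEY,
    table_a_id INT NOT NULL CONSTRAINT FK_TableB_TableA
                   REFERENCES dbo.TableA (id)
);

INSERT INTO dbo.TableA (id) VALUES (5);

-- Accepted: TableA has a row with id = 5.
INSERT INTO dbo.TableB (id, table_a_id) VALUES (1, 5);

-- Rejected: there is no TableA row with id = 6.
INSERT INTO dbo.TableB (id, table_a_id) VALUES (2, 6);

-- Rejected: TableB row 1 still references TableA row 5.
DELETE FROM dbo.TableA WHERE id = 5;
```

The RDBMS does the checking here; no application code has to scan either table.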

In my opinion, not having a PK might be legal (i.e. the RDBMS will let you), but it's not moral (i.e. you shouldn't do it). I think you'd need to have extraordinarily good/powerful reasons to argue for not using a PK in your DB tables (and I'd still find them debatable), but based on your current level of experience (i.e. you say you're "new to data modeling"), I'd say it's not yet enough to attempt justifying a lack of PKs.

There're more reasons, but I hope this gives you enough to work through it.

As far as your M:M relations go, you need to create a new table, called an associative table, and a composite PK in it, that PK being a combination of the 2 PKs of the other 2 tables.

In other words, if there's a M:M relation between tables A and B, then we create a table C that has a 1:M relation with both tables A and B. "Graphically", it'd look similar to:

+---+ 1  M +---+ M  1 +---+
| A |------| C |------| B |
+---+      +---+      +---+

With the C table PK somewhat like this:

+-----+
|  C  |
+-----+
| id  | <-- C.id = A.id + B.id (i.e. combined/concatenated, not addition!)
+-----+

or like this:

+-------+
|   C   |
+-------+
| a_id  | <--|
+-------+    +-- composite PK columns instead
| b_id  | <--|   of concatenation (recommended)
+-------+
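
The recommended (second) variant could be written roughly like this in T-SQL. The A, B, C, a_id and b_id names come from the diagrams above; the INT types and the constraint names are assumptions for the sake of the example.

```sql
CREATE TABLE dbo.A (id INT NOT NULL CONSTRAINT PK_A PRIMARY KEY);
CREATE TABLE dbo.B (id INT NOT NULL CONSTRAINT PK_B PRIMARY KEY);

-- Associative table: one row per (A, B) link.
CREATE TABLE dbo.C (
    a_id INT NOT NULL CONSTRAINT FK_C_A REFERENCES dbo.A (id),
    b_id INT NOT NULL CONSTRAINT FK_C_B REFERENCES dbo.B (id),
    -- Composite primary key: the same link cannot be stored twice.
    CONSTRAINT PK_C PRIMARY KEY (a_id, b_id)
);

INSERT INTO dbo.A (id) VALUES (1);
INSERT INTO dbo.B (id) VALUES (10);

INSERT INTO dbo.C (a_id, b_id) VALUES (1, 10);  -- accepted
INSERT INTO dbo.C (a_id, b_id) VALUES (1, 10);  -- rejected: duplicate link
```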

Is a primary key necessary?

This is a subjective question, so I hope you don't mind me answering with some opinion :)

In the vast majority of tables I've made – I'm talking 95%+ – I've added a primary key, and been glad I did. This is either the most critical unique field in my table (think "social security number") or, more often than not, just an auto-incrementing number that allows me to quickly and easily refer to a row when querying.

This latter use is the most common, and it even has its own name: a "surrogate" or "synthetic" key. This is a value auto-generated by the database and not derived from your application data. If you want to add relations between your tables, this surrogate key is immediately helpful as a foreign key. As someone else answered, these keys are so common that MySQL likes to add one even if you don't, so I'd suggest that means the consensus is very heavily biased towards adding primary keys.
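
Here is a minimal sketch of a surrogate key in T-SQL, assuming a made-up dbo.Customer table. The IDENTITY column is generated by the database, while the business value (Email in this example) is still protected by its own unique constraint.

```sql
-- Hypothetical table with an auto-generated surrogate key.
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Customer PRIMARY KEY,       -- surrogate key
    Email      NVARCHAR(255) NOT NULL
        CONSTRAINT UQ_Customer_Email UNIQUE,      -- business value stays unique
    FullName   NVARCHAR(200) NOT NULL
);

-- The key is not supplied by the application; the database assigns it.
INSERT INTO dbo.Customer (Email, FullName)
VALUES (N'ada@example.com', N'Ada Example');

-- Look the new row up by the key that was just generated in this scope.
SELECT CustomerID, Email
FROM dbo.Customer
WHERE CustomerID = SCOPE_IDENTITY();
```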

One other thing I like about primary keys is that they help convey your intent to others reading your table schemata and also to your DBMS: "this bit is how I intend to identify my rows uniquely, don't let me try to break that rule!"

To answer your question specifically: no, a primary key is not necessary. But realistically if you intend to store data in the table for any period of time beyond a few minutes, I would very strongly recommend you add one.

Creating a SQL database without defining primary key

There are philosophical and practical answers to your question.

The practical answer is that using the primary key constraint enforces "not null", and "unique". This protects you from application-level bugs.

The philosophical answer is that you want developers to operate at the highest possible level of abstraction, so that they don't have to stuff their brain full of detail when trying to solve problems.

Primary and foreign keys are abstractions that allow us to make assumptions about the underlying data model. We can think in terms of (business) entities, and their relationships.

In your workplace, you're forcing developers to think in terms of tables and indexes and conventions. You no longer think about "customers" and "orders" and "line items", but about software artefacts that represent those business entities, and the "we always represent uniqueness by a combination of a GUID and unique index" rule. That mental model is already complicated enough in most applications; you're just making it harder for yourselves, especially when bringing new developers into the team.

Necessary to create index on multi field primary key in SQL server?

If you created your primary key as:

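-- Column types are assumed for illustration; the original omitted them.
CREATE TABLE TBL (UserID INT NOT NULL, SomeTypeID INT NOT NULL, SomeSubType INT NOT NULL, Data VARCHAR(100),
CONSTRAINT PK PRIMARY KEY (UserID, SomeTypeID, SomeSubType));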

Then the default index that is being created is a CLUSTERED index.

Usually (so not always), when looking for data, you want your queries to use a NON-CLUSTERED index to filter rows: the columns you filter on form the key of the index, and the column you return from those rows (in this case Data) goes in as an INCLUDED column, like below:

CREATE NONCLUSTERED INDEX ncl_indx 
ON TBL (UserID, SomeTypeID, SomeSubType) INCLUDE (Data);

By doing this, the query can be answered from the nonclustered index alone, without going through the CLUSTERED index to reach the table data.

But, you can specify the type of index that you want your PRIMARY KEY to be, so:

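-- Column types are assumed for illustration; the original omitted them.
CREATE TABLE TBL (UserID INT NOT NULL, SomeTypeID INT NOT NULL, SomeSubType INT NOT NULL, Data VARCHAR(100),
CONSTRAINT PK PRIMARY KEY NONCLUSTERED (UserID, SomeTypeID, SomeSubType));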

Buuut, because this is defined as a PRIMARY KEY, you are not able to use the INCLUDE functionality, so you can't avoid the lookup needed to get the information from the Data column, which is where you basically are with the default CLUSTERED index.

Buuuuuut, there's still a way to ensure the uniqueness that the Primary Key gives you and benefit from the INCLUDE functionality, so as to do as few disk I/Os as possible.

You can specify your NONCLUSTERED INDEX as UNIQUE which will ensure that all of your 3 columns that make up the index key are unique.

CREATE UNIQUE NONCLUSTERED INDEX ncl_indx 
ON TBL (UserID, SomeTypeID, SomeSubType) INCLUDE (Data);

If you do all of this (a NONCLUSTERED primary key and no CLUSTERED index), your table is going to be a HEAP, which is not a very good thing. If you've given good thought to designing your tables and decided that the best clustering key for your CLUSTERED INDEX is (UserID, SomeTypeID, SomeSubType), then it's best to leave everything as you currently have it.

Otherwise, if you have decided on a different clustering key then you can add this unique nonclustered index, if you're going to query the table as you said you will.
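
To sketch that last case: the table below clusters on a different column and keeps the three columns unique through a covering nonclustered index. The table name TBL_Alt, the CreatedAt column and all column types are assumptions I made up for the example, not something from your schema.

```sql
-- Hypothetical variant of TBL that clusters on a different key.
CREATE TABLE dbo.TBL_Alt (
    UserID      INT          NOT NULL,
    SomeTypeID  INT          NOT NULL,
    SomeSubType INT          NOT NULL,
    Data        VARCHAR(100) NULL,
    CreatedAt   DATETIME2    NOT NULL
);

-- The clustering key chosen for this table (so it is not a heap).
CREATE CLUSTERED INDEX cl_TBL_Alt_CreatedAt
ON dbo.TBL_Alt (CreatedAt);

-- Uniqueness of the three columns, plus Data carried in the index leaf.
CREATE UNIQUE NONCLUSTERED INDEX ncl_TBL_Alt_key
ON dbo.TBL_Alt (UserID, SomeTypeID, SomeSubType) INCLUDE (Data);

-- Check what was created; a heap would show up with type_desc = 'HEAP'.
SELECT name, type_desc, is_unique
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.TBL_Alt');
```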


