SQL Primary Key and Index

sql primary key and index

You are right, it's confusing that SQL Server allows you to create duplicate indexes on the same field(s). But the fact that you can create another doesn't indicate that the PK index doesn't also already exist.

The additional index does no good, but the only harm (very small) is the additional file size and row-creation overhead.

When should I use primary key or index?

Basically, a primary key is (at the implementation level) a special kind of index. Specifically:

  • A table can have only one primary key, and with very few exceptions, every table should have one.
  • A primary key is implicitly UNIQUE - you cannot have more than one row with the same primary key, since its purpose is to uniquely identify rows.
  • A primary key can never be NULL, so the row(s) it consists of must be NOT NULL

A table can have multiple indexes, and indexes are not necessarily UNIQUE. Indexes exist for two reasons:

  • To enforce a uniquness constraint (these can be created implicitly when you declare a column UNIQUE)
  • To improve performance. Comparisons for equality or "greater/smaller than" in WHERE clauses, as well as JOINs, are much faster on columns that have an index. But note that each index decreases update/insert/delete performance, so you should only have them where they're actually needed.

Difference between Primary Key and Unique Index in SQL Server

From SQL UNIQUE Constraint

The UNIQUE constraint uniquely identifies each record in a database
table.

The UNIQUE and PRIMARY KEY constraints both provide a
guarantee for uniqueness for a column or set of columns.

A PRIMARY
KEY constraint automatically has a UNIQUE constraint defined on it.

Note that you can have many UNIQUE constraints per table, but only one
PRIMARY KEY constraint per table.

Also, from Create Unique Indexes

You cannot create a unique index on a single column if that column
contains NULL in more than one row. Similarly, you cannot create a
unique index on multiple columns if the combination of columns
contains NULL in more than one row. These are treated as duplicate
values for indexing purposes.

Whereas from Create Primary Keys

All columns defined within a PRIMARY KEY constraint must be defined as
NOT NULL. If nullability is not specified, all columns participating
in a PRIMARY KEY constraint have their nullability set to NOT NULL.

Is a primary key automatically an index?

Every table needs a clustered index and a primary key. By default, the PK is clustered but it can be non-clustered if you want like you're done.

You have specified a non-clustered PK so the profiler suggests a clustered index...

Note: a table without a clustered index is called a "heap" because it's a pile of unstructured data...

primary key is always indexed in sql server?

Is it possible a primary key is not indexed?

No, it's not.

Some kind of an index is required to police the PRIMARY KEY, otherwise it would require scanning the whole table on each insert (to ensure uniqueness).

From the docs:

The Database Engine automatically creates a unique index to enforce the uniqueness requirement of the PRIMARY KEY constraint. If a clustered index does not already exist on the table or a nonclustered index is not explicitly specified, a unique, clustered index is created to enforce the PRIMARY KEY constraint.


Does this mean a unique constraint has to create a clustered index or a nonclustered index?

Yes, for the same reasons.


PRIMARY KEY and UNIQUE are logical concepts, while the index just has a side effect of implementing them efficiently. So, strictly speaking, ALTER TABLE ADD CONSTRAINT UNIQUE and CREATE UNIQUE INDEX mean different things.

If, in the future, some bright head invents a way to use Pauli principle or quantum entanglement or whatever to enforce uniqueness on physical level, SQL Server 2155 may employ it to enforce the constraint, but it will still have to create the B-Tree for the index.

Relationship of Primary Key and Clustered Index

A primary key is a logical concept - it's the unique identifier for a row in a table. As such, it has a bunch of attributes - it may not be null, and it must be unique. Of course, as you're likely to be searching for records by their unique identifier a lot, it would be good to have an index on the primary key.

A clustered index is a physical concept - it's an index that affects the order in which records are stored on disk. This makes it a very fast index when accessing data, though it may slow down writes if your primary key is not a sequential number.

Yes, you can have a primary key without a clustered index - and sometimes, you may want to (for instance when your primary key is a combination of foreign keys on a joining table, and you don't want to incur the disk shuffle overhead when writing).

Yes, you can create a clustered index on columns that aren't a primary key.

What is the difference between a primary key and a index key

A primary key is a special kind of index in that:

  • there can be only one;
  • it cannot be nullable; and
  • it must be unique.

You tend to use the primary key as the most natural unique identifier for a row (such as social security number, employee ID and so forth, although there is a school of thought that you should always use an artificial surrogate key for this).

Indexes, on the other hand, can be used for fast retrieval based on other columns. For example, an employee database may have your employee number as the primary key but it may also have an index on your last name or your department.

Both of these indexes (last name and department) would disallow NULLs (probably) and allow duplicates (almost certainly), and they would be useful to speed up queries looking for anyone with (for example) the last name 'Corleone' or working in the 'HitMan' department.

Is primary key also index?

  1. In MySQL a PRIMARY or UNIQUE KEY creates an index on the columns defined in the constraint. If there are multiple columns a composite index is created.

    If its an InnoDB table the PRIMARY KEY also becomes the clustered index for the table.

  2. It doesn't make sense to add additional indexes with the same definitions as a PRIMARY/UNIQUE.

For other RDBMS an index will be required for these constraints. Even if you are allowed to create a constraint without an appropriate index, it will be required to get any reasonable performance.

SQL Server indexes still needed when primary key defined on same columns?

When I have a primary key on a column, do I also need a non-clustered index on that same column for querying purposes? Primary keys ARE indexes, aren't they?

A regular index is a sorted copy of one (or multiple) columns. Being sorted it allows for fast searching. If its underlying values change, it will be re-sorted accordingly, but physical table order stays the same.

A clustered index on the other hand defines physical table order. That's why you only can have one - if its values change, the entire table will be re-sorted accordingly.

Often the primary key also is the clustered index of the table. But not necessarily - the defining property of a primary key is its uniqueness.

Having a clustered and a non-clustered index over the same column is redundant and you should not do it. It increases workload during insert/update/delete, but it does nothing for query performance.

if I have an aggregate primary key on two columns, do I need to create indexes on both of those columns for querying purposes?

That depends whether you ever want to query the second column on its own. An index over (A, B) will do nothing for queries that search for B only, so having a second index over B will be necessary in this case.

Include in the index any extra columns you want to return from the query. If set up smartly, a query can be satisfied by the index alone, saving the DB engine from having to look at the table at all.

Note that this applies to non-clustered indexes. Including extra columns for queries against the clustered index is not necessary, as the clustered index is the table. It naturally contains all columns.

if I will be commonly querying for rows specifying two columns to match, is it best to have one index that includes both columns? Or two separate indexes, one on each?

Have a single index that contains both columns, the most selective (highest variance on unique values) or one that you are most likely to query on its own first, the assisting value second. Sometimes it's necessary to have it both ways - (A, B) and (B, A), it entirely depends on how the table is used.



Related Topics



Leave a reply



Submit