Should a Composite Primary Key Be Clustered in SQL Server

Should a Composite Primary Key be clustered in SQL Server?

As has already been said by several others, it depends on how you will access the table. Keep in mind, though, that any RDBMS should be able to use a composite index to search on a single column, as long as that column is the leading column of the index. For example, if your clustered index is on (parent_id, child_id), you don't need a separate index on (parent_id).

Your best bet may be a clustered index on (parent_id, child_id), which also happens to be the primary key, with a separate non-clustered index on (child_id).
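A minimal T-SQL sketch of that layout follows. The table name dbo.ParentChild and the constraint and index names are made up for illustration; only parent_id and child_id come from the discussion above.

```sql
-- Hypothetical table: only the two key columns are taken from the answer.
CREATE TABLE dbo.ParentChild (
    parent_id INT NOT NULL,
    child_id  INT NOT NULL,
    -- The composite primary key doubles as the clustered index.
    CONSTRAINT PK_ParentChild PRIMARY KEY CLUSTERED (parent_id, child_id)
);

-- Optional second index for lookups by child_id alone, which cannot
-- seek on the clustered key because child_id is not its leading column.
CREATE NONCLUSTERED INDEX IX_ParentChild_child_id
    ON dbo.ParentChild (child_id);

-- Can seek on the clustered index: parent_id is the leading key column.
SELECT parent_id, child_id
FROM   dbo.ParentChild
WHERE  parent_id = 1;

-- Can seek on IX_ParentChild_child_id. A nonclustered index on a clustered
-- table also carries the clustered key, so parent_id is available here too.
SELECT parent_id, child_id
FROM   dbo.ParentChild
WHERE  child_id = 2;
```

Whether the optimizer actually picks a seek depends on table size and statistics, so check the execution plan rather than assuming it.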

Ultimately, indexing should be addressed after you've got an idea of how the database will be accessed. Come up with some standard performance stress tests if you can and then analyze the behavior using a profiling tool (SQL Profiler for SQL Server) and performance tune from there. If you don't have the expertise or knowledge to do that ahead of time, then try for a (hopefully limited) release of the application, collect the performance metrics, and see where you need to improve performance and figure out what indexes will help.

If you do things right, you should be able to capture the "typical" profile of how the database is accessed and you can then rerun that over and over again on a test server as you try various indexing approaches.
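One rough way to check which indexes a replayed workload actually touches is the index usage DMV, sketched below. This is a different tool from the profiler trace mentioned above, not a replacement for it, and the counters are reset when the instance restarts.

```sql
-- Seek/scan/lookup/update counts per index for user tables in the
-- current database, accumulated since the last instance restart.
SELECT  OBJECT_NAME(i.object_id) AS table_name,
        i.name                   AS index_name,
        i.type_desc,
        s.user_seeks,
        s.user_scans,
        s.user_lookups,
        s.user_updates
FROM    sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON  s.object_id   = i.object_id
       AND s.index_id    = i.index_id
       AND s.database_id = DB_ID()
WHERE   OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY table_name, index_name;
```

An index with many user_updates but no seeks, scans, or lookups after a representative run is a candidate for removal. A NULL row means the index has not been touched at all since the restart.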

In your case I would probably just put a clustered PK on (parent_id, child_id) to start with and then add the non-clustered index only if I saw a performance problem that would be helped by it.

Composite Primary Keys : Good or Bad?

There is no conclusion that composite primary keys are bad.

The best practice is to have some column or columns that uniquely identify a row. But in some tables a single column is not enough by itself to uniquely identify a row.

SQL (and the relational model) allows a composite primary key. It is a good practice in some cases. Or, another way of looking at it is that it's not a bad practice in all cases.

Some people have the opinion that every table should have an integer column that automatically generates unique values, and that this column should serve as the primary key. Some people also claim that this primary key column should always be called id. But those are conventions, not necessarily best practices. Conventions have some benefit, because they simplify certain decisions. But conventions are also restrictive.

You may have an order with multiple payments because some people purchase on layaway, or else they have multiple sources of payment (two credit cards, for instance), or two different people want to pay for a share of the order (I frequently go to a restaurant with a friend, and we each pay for our own meal, so the staff process half of the order on each of our credit cards).

I would design the system you describe as follows:

Products  : product_id (PK)

Orders    : order_id (PK)

LineItems : product_id is (FK) to Products
            order_id is (FK) to Orders
            (product_id, order_id) is (PK)

Payments  : order_id (FK)
            payment_id - ordinal for each order_id
            (order_id, payment_id) is (PK)

This is also related to the concept of an identifying relationship. If it's definitional that a payment exists only because an order exists, then make the order part of the primary key.

Note the LineItems table also lacks its own auto-increment, single-column primary key. A many-to-many table is a classic example of a good use of a composite primary key.
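The design above could be written in T-SQL roughly as follows. The INT data types and the constraint names are assumptions, and the non-key columns (prices, quantities, amounts and so on) are left out because the answer does not specify them.

```sql
CREATE TABLE dbo.Products (
    product_id INT NOT NULL CONSTRAINT PK_Products PRIMARY KEY
);

CREATE TABLE dbo.Orders (
    order_id INT NOT NULL CONSTRAINT PK_Orders PRIMARY KEY
);

-- Many-to-many table: the two foreign keys together form the primary key,
-- so there is no separate auto-increment column.
CREATE TABLE dbo.LineItems (
    product_id INT NOT NULL
        CONSTRAINT FK_LineItems_Products FOREIGN KEY REFERENCES dbo.Products (product_id),
    order_id   INT NOT NULL
        CONSTRAINT FK_LineItems_Orders FOREIGN KEY REFERENCES dbo.Orders (order_id),
    CONSTRAINT PK_LineItems PRIMARY KEY (product_id, order_id)
);

-- Identifying relationship: a payment is identified by its order plus
-- an ordinal within that order (1, 2, 3, ...).
CREATE TABLE dbo.Payments (
    order_id   INT NOT NULL
        CONSTRAINT FK_Payments_Orders FOREIGN KEY REFERENCES dbo.Orders (order_id),
    payment_id INT NOT NULL,
    CONSTRAINT PK_Payments PRIMARY KEY (order_id, payment_id)
);
```

In SQL Server a PRIMARY KEY constraint is clustered by default when the table has no other clustered index, so each of these keys also determines the physical order of its table unless NONCLUSTERED is specified.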

SQL Server - Any advantage of using composite primary key here?

Yes, a composite clustered primary key on those two columns, in that order, would be very good for that query.

It would allow a simple range seek on a covering index. To be declared as a primary key, both columns must be non-nullable and the combination must be unique, but I assume this is the case.

The rate1, rate2, rate3 columns look suspect, though, and may indicate that your table is not in first normal form.

Performance between non-clustered index and composite primary key

This is too long for a comment.

How much slower is "slower"? When searching through a non-clustered index, the database engine needs to find the row references in the index (quite fast) and then load the data pages to fetch the row.

When searching using a clustered index, there is no separate lookup step: the leaf level of the clustered index is the data pages, so finding the key also finds the row.

The difference is likely to be much more noticeable when fetching multiple rows, because the clustered index will have the data on the same data pages. The non-clustered index is likely to be fetching from a different page for each item being retrieved (up to a point).

You can compare the difference in performance by fetching only columns in the index. This might not be what you want, but it is a viable performance comparison. These should be similar between the two indexes.

This might explain the difference in performance. If so, then it is nothing to worry about, because it is the expected overhead of not using a clustered index. In general, this overhead is relatively large for queries that are otherwise fast and matters less for queries that are slower.
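The comparison described above can be sketched as follows. The table dbo.Items, its columns, and the index name are hypothetical, since the original question's schema is not shown here.

```sql
-- Hypothetical heap table with one nonclustered index.
CREATE TABLE dbo.Items (
    group_id INT          NOT NULL,
    item_id  INT          NOT NULL,
    payload  VARCHAR(200) NULL
);

CREATE NONCLUSTERED INDEX IX_Items_group_item
    ON dbo.Items (group_id, item_id);

-- Report logical reads per statement so the two queries can be compared.
SET STATISTICS IO ON;

-- Only indexed columns are selected, so the index alone can answer this.
SELECT group_id, item_id
FROM   dbo.Items
WHERE  group_id = 42;

-- payload is not in the index, so each matching row needs an extra
-- lookup into the data pages (or the optimizer may choose a scan instead).
SELECT group_id, item_id, payload
FROM   dbo.Items
WHERE  group_id = 42;

SET STATISTICS IO OFF;
```

If the first query performs about the same under both indexing schemes and only the second one is slower, the extra cost is the lookup into the data pages rather than the index itself.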

Primary key non-clustered (composite key) and clustered index on different column in same table?

One common scenario where you might end up with a primary key that is a non-clustered composite key is a junction table. A junction table mainly exists to store a relationship between two primary key values from other tables. A simple example would be storing, say, relationships between students and the courses they take. As such, the primary (unique) key in such a table would actually be the combination of the two foreign key columns. That being said, there can still be a clustered index on some other column. There is nothing at all out of the ordinary here, assuming such a table falls in line with your design intentions.
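A sketch of such a junction table, assuming a hypothetical enrolled_at column as the "other column" that gets the clustered index. The table, column, and constraint names are illustrative, and the foreign keys to the students and courses tables are omitted to keep the example self-contained.

```sql
CREATE TABLE dbo.StudentCourses (
    student_id  INT       NOT NULL,
    course_id   INT       NOT NULL,
    enrolled_at DATETIME2 NOT NULL
        CONSTRAINT DF_StudentCourses_enrolled_at DEFAULT SYSUTCDATETIME(),
    -- The composite key enforces uniqueness but does not dictate row order.
    CONSTRAINT PK_StudentCourses
        PRIMARY KEY NONCLUSTERED (student_id, course_id)
);

-- The one clustered index per table goes on a different column.
-- It is not declared UNIQUE, so duplicate enrolled_at values are allowed.
CREATE CLUSTERED INDEX CIX_StudentCourses_enrolled_at
    ON dbo.StudentCourses (enrolled_at);
```

Whether clustering on a date like this is worthwhile depends on whether the table is mostly read by date range or by the key pair.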

Composite primary key / clustered index, fragmentation, performance

You need only a clustered index that includes both fields. An index is ordered data whether it is clustered or not.
If you make it a non-clustered index, your data will be doubled, and every insert will need double the resources because it has to write both to the heap (or to a clustered index on a row id) and to the non-clustered index. A seek, however, will use only the non-clustered index, because all the needed data is included in it.

So make it a clustered index and be happy :)

In SQL can a table have both primary keys + composite keys?

Yes.

A table can have more than one Key, and a Key has one or more key columns.

In SQL Server you create a Key with any of a UNIQUE CONSTRAINT, a PRIMARY KEY CONSTRAINT, or a UNIQUE INDEX. A table can have at most one PRIMARY KEY CONSTRAINT, but can have any number of UNIQUE CONSTRAINTs or UNIQUE INDEXes.

So yes, a table can have a PRIMARY KEY on one column, and a composite UNIQUE INDEX or UNIQUE CONSTRAINT.
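For illustration, here is a table with all three kinds of key. The dbo.Employees table and its columns are invented for this example.

```sql
CREATE TABLE dbo.Employees (
    -- Single-column primary key.
    employee_id     INT          NOT NULL CONSTRAINT PK_Employees PRIMARY KEY,
    department_code CHAR(4)      NOT NULL,
    badge_number    INT          NOT NULL,
    email           VARCHAR(254) NOT NULL,
    -- Composite key declared as a UNIQUE constraint.
    CONSTRAINT UQ_Employees_dept_badge UNIQUE (department_code, badge_number)
);

-- A further key declared as a UNIQUE index instead of a constraint.
CREATE UNIQUE INDEX UX_Employees_email
    ON dbo.Employees (email);
```

All three reject duplicate values. Only the PRIMARY KEY additionally requires its columns to be NOT NULL.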


