Foreign Keys - What Do They Do for Me

Foreign keys provide referential integrity. The data in a foreign key column is validated: the value can only be one that already exists in the table and column named in the foreign key definition. That is very effective at stopping "bad data", because someone can't enter whatever they want (arbitrary numbers, free text, etc.). It also goes hand in hand with normalization: repeating values are identified and isolated in their own table, so there are no more concerns about things like case sensitivity in text, and the values stay consistent. This leads into the next part: foreign keys are what you typically use to join tables together.
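To make the validation concrete, here is a minimal sketch, assuming SQLite driven through Python's built-in sqlite3 module (the answer itself names no engine). The statuses and tickets tables and their columns are hypothetical names chosen for illustration. Note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs unless this is on

# Hypothetical lookup table holding the only values allowed.
conn.execute(
    "CREATE TABLE statuses (status_id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE)"
)
# Hypothetical referencing table: status_id must exist in statuses.
conn.execute(
    """
    CREATE TABLE tickets (
        ticket_id INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES statuses(status_id)
    )
    """
)
conn.executemany("INSERT INTO statuses VALUES (?, ?)", [(1, "open"), (2, "closed")])

conn.execute("INSERT INTO tickets (status_id) VALUES (1)")  # valid reference

try:
    conn.execute("INSERT INTO tickets (status_id) VALUES (7)")  # no such status
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

ticket_count = conn.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
```

Only the row that points at an existing status is stored; the other insert is refused by the database itself rather than by application code.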

Your query for the projects a user has would not work: it references a column from the USERS table, but the query never references that table, and there is no subquery that fetches the information before linking it to the PROJECTS table. What you'd really use is:

SELECT p.*
FROM PROJECTS p
JOIN USERS u ON u.user_id = p.creator
WHERE u.username = 'John Smith'
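The query above can be tried end to end with a small sketch like the following, assuming SQLite via Python's sqlite3 module. Only USERS, PROJECTS, user_id, username and creator come from the answer; the project_id and name columns and the sample rows are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE USERS (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE PROJECTS (
        project_id INTEGER PRIMARY KEY,   -- assumed column
        name TEXT NOT NULL,               -- assumed column
        creator INTEGER NOT NULL REFERENCES USERS(user_id)
    )
    """
)
# Made-up sample data.
conn.executemany("INSERT INTO USERS VALUES (?, ?)", [(1, "John Smith"), (2, "Jane Doe")])
conn.executemany(
    "INSERT INTO PROJECTS VALUES (?, ?, ?)",
    [(10, "Alpha", 1), (11, "Beta", 2), (12, "Gamma", 1)],
)

# Same join as in the answer, with an ORDER BY added for a stable result.
johns_projects = conn.execute(
    """
    SELECT p.*
    FROM PROJECTS p
    JOIN USERS u ON u.user_id = p.creator
    WHERE u.username = ?
    ORDER BY p.project_id
    """,
    ("John Smith",),
).fetchall()
```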

Can someone explain what a Foreign Key is, and why you use it?

In the context of relational databases, a foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table or the same table. In simpler words, the foreign key is defined in a second table, but it refers to the primary key or a unique key in the first table.

This takes us to the primary key. Each row of the customers table carries a unique value in a column called customerNumber; this is the primary key for that table. In the orders table, the orderNumber column is the primary key.

The orders table has a foreign key that links back to the customers table through customerNumber. In the orders table, customerNumber is the foreign key.

Customer table:

customerNumber   customerName
1                Bob
2                Jane

Order table:

orderNumber   customerNumber   status
1             1                Shipped
2             1                Exploded

Using the data above, if you wanted to see which orders Bob had, you would take his primary key (customerNumber 1) and look in the orders table for all rows containing it. That is the foreign key linking the two tables.
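The lookup described above could be sketched as follows, assuming SQLite via Python's sqlite3 module. The table and column names and the rows are taken from the example; the choice of engine and the exact column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE customers (customerNumber INTEGER PRIMARY KEY, customerName TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        orderNumber INTEGER PRIMARY KEY,
        customerNumber INTEGER NOT NULL REFERENCES customers(customerNumber),
        status TEXT NOT NULL
    )
    """
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Bob"), (2, "Jane")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, "Shipped"), (2, 1, "Exploded")],
)

def orders_for(name):
    """Return (orderNumber, status) rows for the named customer."""
    return conn.execute(
        """
        SELECT o.orderNumber, o.status
        FROM orders o
        JOIN customers c ON c.customerNumber = o.customerNumber
        WHERE c.customerName = ?
        ORDER BY o.orderNumber
        """,
        (name,),
    ).fetchall()

bobs_orders = orders_for("Bob")
janes_orders = orders_for("Jane")
```

Bob's customerNumber appears in both order rows, so both come back; Jane has no orders, so her result is empty.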

Are foreign keys really necessary in a database design?

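Foreign keys help enforce referential integrity at the data level. They can also help performance where the foreign key column is indexed: some engines (MySQL's InnoDB, for example) create that index automatically, while others, such as SQL Server and PostgreSQL, leave it to you.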

Why are foreign keys more used in theory than in practice?

The reason foreign key constraints exist is to guarantee that the referenced rows exist.

"The foreign key identifies a column or a set of columns in one table that refers to a column or set of columns in another table. The values in one row of the referencing columns must occur in a single row in the referenced table.

Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL). This way references can be made to link information together and it is an essential part of database normalization." (Wikipedia)


RE: Your question: "I can't imagine the need to join tables by fields that aren't FKs":

When defining a Foreign Key constraint, the referenced column(s) must be the primary key of the referenced table, or at least a candidate key (a column or set of columns covered by a unique constraint).
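As an illustration of a foreign key that targets a candidate key rather than the primary key, here is a minimal sketch assuming SQLite via Python's sqlite3 module. The countries table and its iso_code column are hypothetical; the clients columns are simplified from the example that follows in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# iso_code is not the primary key, but UNIQUE makes it a candidate key.
conn.execute(
    """
    CREATE TABLE countries (
        country_id INTEGER PRIMARY KEY,
        iso_code TEXT NOT NULL UNIQUE
    )
    """
)
conn.execute(
    """
    CREATE TABLE clients (
        client_id INTEGER PRIMARY KEY,
        client_name TEXT NOT NULL,
        client_country TEXT NOT NULL REFERENCES countries(iso_code)
    )
    """
)
conn.execute("INSERT INTO countries (iso_code) VALUES ('AU')")

try:
    conn.execute("INSERT INTO clients (client_name, client_country) VALUES ('Acme', 'AU')")
    accepted = True
except sqlite3.IntegrityError:
    accepted = False

try:
    conn.execute("INSERT INTO clients (client_name, client_country) VALUES ('Birch', 'ZZ')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```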

When doing joins, there is no need to join with primary keys or candidate keys.

The following is an example that could make sense:

CREATE TABLE clients (
    client_id uniqueidentifier NOT NULL,
    client_name nvarchar(250) NOT NULL,
    client_country char(2) NOT NULL
);

CREATE TABLE suppliers (
    supplier_id uniqueidentifier NOT NULL,
    supplier_name nvarchar(250) NOT NULL,
    supplier_country char(2) NOT NULL
);

And then query as follows:

SELECT
    client_name, supplier_name, client_country
FROM
    clients
INNER JOIN
    suppliers ON (clients.client_country = suppliers.supplier_country)
ORDER BY
    client_country;
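The same join can be run as a small sketch, assuming SQLite via Python's sqlite3 module instead of SQL Server. The column types are swapped for TEXT, the sample rows are invented, and supplier_name is added to the ORDER BY only to make the output order stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No foreign key is declared between these tables.
conn.execute(
    "CREATE TABLE clients (client_id TEXT NOT NULL, client_name TEXT NOT NULL, client_country TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE suppliers (supplier_id TEXT NOT NULL, supplier_name TEXT NOT NULL, supplier_country TEXT NOT NULL)"
)
# Invented sample rows.
conn.executemany(
    "INSERT INTO clients VALUES (?, ?, ?)",
    [("c1", "Acme", "AU"), ("c2", "Birch", "NZ")],
)
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?, ?)",
    [("s1", "Solo", "AU"), ("s2", "Duo", "AU"), ("s3", "Trio", "US")],
)

# Clients paired with every supplier in the same country.
same_country = conn.execute(
    """
    SELECT client_name, supplier_name, client_country
    FROM clients
    INNER JOIN suppliers ON clients.client_country = suppliers.supplier_country
    ORDER BY client_country, supplier_name
    """
).fetchall()
```

Neither country column is a key of any kind, yet the join is well defined: the Australian client is paired with both Australian suppliers, and rows without a counterpart drop out.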

Another case where these joins make sense is in databases that offer geospatial features, like SQL Server 2008 or Postgres with PostGIS. You will be able to do queries like these:

SELECT
    state, electorate
FROM
    electorates
INNER JOIN
    postcodes ON (postcodes.Location.STIntersects(electorates.Location) = 1);

Source: ConceptDev - SQL Server 2008 Geography: STIntersects, STArea

You can see another similar geospatial example in the accepted answer to the post "Sql 2008 query problem - which LatLong’s exists in a geography polygon?":

SELECT
    G.Name, COUNT(CL.Id)
FROM
    GeoShapes G
INNER JOIN
    CrimeLocations CL ON G.ShapeFile.STIntersects(CL.LatLong) = 1
GROUP BY
    G.Name;

These are all valid SQL joins that have nothing to do with foreign keys and candidate keys, and can still be useful in practice.

SQL Server Foreign Key constraint benefits

  • Foreign keys provide no performance or scalability benefits.
  • Foreign keys enforce referential integrity. This can provide a practical benefit by raising an error if someone attempts to delete rows from the parent table by mistake.
  • Foreign keys are not indexed by default. You should index your foreign key columns, as this avoids a table scan on the child table when you delete or update a parent row.
  • You can make a foreign key column nullable and insert null.
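The last three points can be sketched as follows. This uses SQLite through Python's sqlite3 module as a stand-in, not SQL Server, so treat it as an illustration of the general behaviour rather than of SQL Server specifics. The parent and child tables and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
# parent_id is nullable: NULL means "no parent", which the constraint permits.
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES parent(id))"
)
# The index on the referencing column is created explicitly here.
conn.execute("CREATE INDEX idx_child_parent_id ON child (parent_id)")

conn.execute("INSERT INTO parent (id) VALUES (1)")
conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 1)")

try:
    conn.execute("INSERT INTO child (id, parent_id) VALUES (2, NULL)")
    null_allowed = True
except sqlite3.IntegrityError:
    null_allowed = False

try:
    conn.execute("DELETE FROM parent WHERE id = 1")  # still referenced by child 1
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

parents_left = conn.execute("SELECT COUNT(*) FROM parent").fetchone()[0]
```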

Data Mart design - Best practice - Why are foreign keys not used?

Foreign keys are constraints used to ensure consistency of data in a database - their purpose is not to document the structure of your database, rather it is to enforce data consistency rules by controlling what changes are allowed to the database.

This is all good in a live database where data integrity is key, but in a datamart there is no need to enforce these rules - we know the data is consistent because it's a copy / extract of the live database where these rules are enforced.

Foreign keys also come with some disadvantages:

  • They complicate the datamart extract process (you need to ensure that data is extracted in a certain order)
  • They prevent partial exports (where you export only certain tables from your database)
  • They also incur a runtime performance penalty when making changes to the database as the database server has to check / validate each constraint as changes are made
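The load-order point can be illustrated with a small sketch, assuming SQLite via Python's sqlite3 module. The dim_customer and fact_order tables and the try_load helper are hypothetical names invented for the example.

```python
import sqlite3

def try_load(table_order, enforce_fks):
    """Load one row per table in the given order; return True if every insert succeeds."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.execute("PRAGMA foreign_keys = " + ("ON" if enforce_fks else "OFF"))
    conn.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE fact_order (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES dim_customer(customer_id)
        )
        """
    )
    inserts = {
        "dim_customer": "INSERT INTO dim_customer VALUES (1)",
        "fact_order": "INSERT INTO fact_order VALUES (100, 1)",
    }
    try:
        for table in table_order:
            conn.execute(inserts[table])
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()

parents_first_enforced = try_load(["dim_customer", "fact_order"], enforce_fks=True)
children_first_enforced = try_load(["fact_order", "dim_customer"], enforce_fks=True)
children_first_unenforced = try_load(["fact_order", "dim_customer"], enforce_fks=False)
```

With enforcement on, the extract has to load the referenced table first; with it off, any order works, which is the trade-off the list above describes.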

In short, they reduce performance and provide no real benefit, so why bother? Just make sure that your datamart is documented properly elsewhere.

You might be interested in these questions:

  • How to document a database
  • How do you document your database structure?

What are the advantages of defining a foreign key?

Foreign keys with constraints (in the DB engines that enforce them) give you data integrity at the lowest level, the database itself.
It means you physically can't create a record that doesn't satisfy the relationship.
It's just a way to be safer.

What can I do with a foreign key that I can't with JOIN in a SQL statement?

Relationships between rows of two tables can be established by storing a "common value" in columns of each table. (This is a fundamental tenet of relational database theory.)

A FOREIGN KEY is an integrity constraint in the database. If a foreign key constraint is defined (and enforced), the database will prohibit invalid values from being stored in a row (by an INSERT or UPDATE statement) and prevent referenced rows from being removed (by a DELETE statement).

A JOIN operation in a SQL statement just allows us to access multiple tables. Typically, a join operation will include conditions that require a "match" of foreign key in one table with a primary key of another table. But this isn't required. It's possible to "join" tables on a huge variety of conditions, or on no condition at all (CROSS JOIN).
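To show the difference in practice, here is a minimal sketch assuming SQLite via Python's sqlite3 module. The customers and orders tables are hypothetical and are deliberately created without any foreign key, so joins still work but nothing stops an order from pointing at a customer that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Deliberately no FOREIGN KEY clause: a JOIN does not need one.
conn.execute("CREATE TABLE customers (customerNumber INTEGER PRIMARY KEY, customerName TEXT)")
conn.execute(
    "CREATE TABLE orders (orderNumber INTEGER PRIMARY KEY, customerNumber INTEGER, status TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Bob"), (2, "Jane")])
# Order 3 references customer 42, who does not exist; nothing rejects it.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, "Shipped"), (2, 1, "Exploded"), (3, 42, "Lost")],
)

# Typical join: match the "foreign key" column against the primary key.
matched = conn.execute(
    "SELECT COUNT(*) FROM orders o JOIN customers c ON c.customerNumber = o.customerNumber"
).fetchone()[0]

# Join with no condition at all: every order paired with every customer.
every_pair = conn.execute("SELECT COUNT(*) FROM orders CROSS JOIN customers").fetchone()[0]

# Without a constraint, orphaned rows have to be found by querying for them.
orphans = conn.execute(
    """
    SELECT o.orderNumber
    FROM orders o
    LEFT JOIN customers c ON c.customerNumber = o.customerNumber
    WHERE c.customerNumber IS NULL
    """
).fetchall()
```

The join reads data however you ask it to; only a constraint would have refused the orphaned order at write time.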

Is it fine to have foreign key as primary key?

Foreign keys are almost always "Allow Duplicates," which would make them unsuitable as Primary Keys.

Instead, find a field that uniquely identifies each record in the table, or add a new field (either an auto-incrementing integer or a GUID) to act as the primary key.

The only exception to this is a pair of tables with a one-to-one relationship, where the foreign key and the primary key of the linked table are one and the same.
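A one-to-one layout of that kind might look like the following sketch, assuming SQLite via Python's sqlite3 module. The users and user_profiles tables are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
# user_id is both the primary key of user_profiles and a foreign key to users,
# so each user can have at most one profile row.
conn.execute(
    """
    CREATE TABLE user_profiles (
        user_id INTEGER PRIMARY KEY REFERENCES users(user_id),
        bio TEXT
    )
    """
)
conn.execute("INSERT INTO users VALUES (1, 'John Smith')")
conn.execute("INSERT INTO user_profiles VALUES (1, 'First profile')")

def insert_fails(user_id):
    """Return True if adding a profile row for user_id violates a constraint."""
    try:
        conn.execute("INSERT INTO user_profiles VALUES (?, 'Another profile')", (user_id,))
        return False
    except sqlite3.IntegrityError:
        return True

second_profile_blocked = insert_fails(1)   # primary key: one profile per user
unknown_user_blocked = insert_fails(99)    # foreign key: user must exist
```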


