Does Foreign Key Improve Query Performance

Does Foreign Key improve query performance?

Foreign keys are a referential integrity tool, not a performance tool. At least in SQL Server, creating an FK does not create an associated index, so you should index every FK column yourself to improve lookup times.
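To make that concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, since SQLite (like SQL Server) does not index the child column when you declare an FK. The table names, the index name and the helper `index_names` are made up for illustration.

```python
import sqlite3


def index_names(conn, table):
    """Return the names of all indexes currently defined on `table`."""
    # PRAGMA index_list yields one row per index; column 1 is the index name.
    return [row[1] for row in conn.execute(f"PRAGMA index_list({table})")]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
""")

# Declaring the FK alone leaves orders.customer_id without an index.
before = index_names(conn, "orders")

# The index has to be created explicitly (hypothetical index name).
conn.execute("CREATE INDEX idx_orders_customer_id ON orders(customer_id)")
after = index_names(conn, "orders")

print("indexes before:", before)
print("indexes after: ", after)
```

On SQL Server the equivalent step would be a plain CREATE INDEX on the FK column; only the catalog query used to list indexes differs.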

Does using Foreign Key speed up table joins?

Foreign keys do not directly speed up the execution of queries. They do have an indirect effect: in most engines the referenced column must be a primary key or carry a unique constraint, so it is guaranteed to be indexed, and that index does have an impact on performance.

As you describe the problem, all the join relationships should include the primary key on one of the tables. The resulting queries should be very efficient in execution.

I would not worry about 5 or 6 joins in a query unless you have a very large amount of data (more than one table with millions of rows) or you are in a severely memory-constrained environment.

Does adding a foreign key to an indexed column boost performance?

Yes, foreign keys can definitely improve the performance of queries, but it depends on the database you are using and often on whether the keys are 'enforced' or not.

In Oracle and SQL Server, having foreign keys can definitely increase performance when reading from or joining multiple tables on their foreign key.

Why? A checked/validated foreign key gives the query optimizer extra information about the relationship between the two tables.

It knows that when a child table is inner joined to a parent table:

  1. Each child row matches at most one parent row (exactly one when the key is NOT NULL), so the join cannot return more rows than the child table contains.
  2. Every non-NULL key in the child exists in the parent.

All of this helps the query optimizer estimate the number of rows that are going to be processed. Getting this estimate right is really important for most (if not all) query optimizers.

Proof of this general fact can be seen in the recent addition of metadata-only foreign keys to Hadoop Hive. The goal of this addition is to help the CBO (Cost Based Optimizer); this Hive Jira entry explains...

Furthermore, having (bitmap) indexes on foreign keys also improves performance in Oracle when using fact tables:
'A bitmap index should be built on each of the foreign key columns of the fact table or tables'.
See the following link...

Foreign keys, for obvious reasons, will cost you extra when inserting or updating data: the database has to do more work than it would without FKs.

You can easily see this in SQL Server (for example) by inspecting the execution plans.

I do not know PostgreSQL, but my approach to validating the effect of FKs would be to look at the explain plans. Do they differ when the FKs are enabled, disabled or dropped?

[Edit]
I actually found this proof that FKs can improve read performance in PostgreSQL, but the reason is somewhat different: because the FKs are enabled, the query in the example can be rewritten to be more performant.
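The explain-plan comparison suggested above can be scripted. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for PostgreSQL's EXPLAIN or SQL Server's execution plans, and the schema and the helpers `build_db` and `query_plan` are hypothetical. It reports whether the plans differ rather than assuming an outcome, because that depends entirely on the engine.

```python
import sqlite3

# Hypothetical join whose plan we want to compare across the two schemas.
QUERY = """
    SELECT o.id, c.name
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
"""


def build_db(with_fk):
    """Create the same two tables, with or without the FK constraint."""
    conn = sqlite3.connect(":memory:")
    fk_clause = "REFERENCES customers(id)" if with_fk else ""
    conn.executescript(f"""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL {fk_clause}
        );
    """)
    return conn


def query_plan(conn, sql):
    """Return the human-readable steps of the plan chosen for `sql`."""
    # The last column of each EXPLAIN QUERY PLAN row is the detail text.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_with_fk = query_plan(build_db(with_fk=True), QUERY)
plan_without_fk = query_plan(build_db(with_fk=False), QUERY)
plans_differ = plan_with_fk != plan_without_fk

print("with FK:   ", plan_with_fk)
print("without FK:", plan_without_fk)
print("plans differ:", plans_differ)
```

The same pattern applies to any engine: capture the plan text for one query under each schema variant and diff the two.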

Can foreign keys hurt query performance?

I'm assuming that for INSERT queries, constraints - including foreign key constraints - will slow performance somewhat. The database has to check that whatever you've told it to insert is something that your constraints allow it to insert.

For SELECT queries, foreign key constraints shouldn't make any changes to performance.

Since INSERTs are almost always very quick, the small amount of extra time won't be noticeable except in fringe cases. (When building a database of several gigabytes, you might want to disable constraints and re-enable them later, as long as you're sure the data is good.)
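Here is a rough sketch of both points, using Python's sqlite3 module as a stand-in engine: first the check the database performs on an INSERT, then a bulk load with enforcement switched off and a verification pass afterwards. The table names and sample rows are invented, and the PRAGMA statements are SQLite-specific; other engines have their own syntax for disabling and re-validating constraints.

```python
import sqlite3

# isolation_level=None keeps the connection in autocommit mode, so the
# foreign_keys pragma (a no-op inside a transaction) always takes effect.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
""")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("INSERT INTO customers (id) VALUES (1)")

# With enforcement on, every insert is checked against the parent table.
try:
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Bulk load with enforcement off: faster, but bad rows slip through.
conn.execute("PRAGMA foreign_keys = OFF")
conn.executemany(
    "INSERT INTO orders (id, customer_id) VALUES (?, ?)",
    [(11, 1), (12, 1), (13, 999)],  # the last row has no matching customer
)
conn.execute("PRAGMA foreign_keys = ON")

# In SQLite, re-enabling does not re-validate existing rows, so check
# explicitly. Each result row is (table, rowid, parent table, fk index).
violations = conn.execute("PRAGMA foreign_key_check").fetchall()

print("bad insert rejected:", rejected)
print("violations after bulk load:", violations)
```

The verification pass is the part that matters: disabling constraints is only safe if something confirms the data afterwards.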

Does introducing foreign keys to MySQL reduce performance?

Assuming:

  1. You are already using a storage engine that supports FKs (i.e., InnoDB)
  2. You already have indexes on the columns involved

Then I would guess that you'll get better performance by having MySQL enforce integrity. Enforcing referential integrity is, after all, something that database engines are optimized to do. Writing your own code to manage integrity in Ruby is going to be slow in comparison.

If you need to move from MyISAM to InnoDB to get the FK functionality, you need to consider the tradeoffs in performance between the two engines.

If you don't already have indexes, you need to decide if you want them. Generally speaking, if you're doing more reads than writes, you want (need, even) the indexes.

Stacking an FK on top of stuff that is currently indexed should cause less of an overall performance hit than implementing those kinds of checks in your application code.

Is there a severe performance hit for using Foreign Keys in SQL Server?

There is a tiny performance hit on inserts, updates and deletes because the FK has to be checked. For an individual record this would normally be so slight as to be unnoticeable unless you start having a ridiculous number of FKs associated to the table (Clearly it takes longer to check 100 other tables than 2). This is a good thing not a bad thing as databases without integrity are untrustworthy and thus useless. You should not trade integrity for speed. That performance hit is usually offset by the better ability to optimize execution plans.

We have a medium sized database with around 9 million records and FKs everywhere they should be and rarely notice a performance hit (except on one badly designed table that has well over 100 foreign keys, it is a bit slow to delete records from this as all must be checked). Almost every dba I know of who deals with large, terabyte sized databases and a true need for high performance on large data sets insists on foreign key constraints because integrity is key to any database. If the people with terabyte-sized databases can afford the very small performance hit, then so can you.

FKs are not automatically indexed in SQL Server, and an FK that is left unindexed can cause performance problems.

Honestly, I'd take a copy of your database, add properly indexed FKs and show the time difference to insert, delete, update and select from those tables in comparison with the same operations on your database without the FKs. Show that you won't be causing a performance hit. Then show the results of queries that find orphaned records, which no longer have meaning because the PK they are related to no longer exists. It is especially effective to show this for tables which contain financial information ("We have 2700 orders that we can't associate with a customer" will make management sit up and take notice).
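An orphan query of the kind described above is usually a LEFT JOIN that keeps only the child rows with no matching parent. Below is a minimal sketch run through Python's sqlite3 module; the `customers` and `orders` tables and their sample rows are made up for illustration, and the SQL itself should carry over to SQL Server largely unchanged.

```python
import sqlite3

# Child rows whose customer_id has no matching row in the parent table.
ORPHANS_SQL = """
    SELECT o.id, o.customer_id
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    WHERE c.id IS NULL
    ORDER BY o.id
"""

conn = sqlite3.connect(":memory:")
# No FK constraint here on purpose: this models the database before FKs
# were added, where nothing stops an order from pointing at nothing.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES
        (100, 1, 25.0),
        (101, 3, 40.0),
        (102, 2, 15.0),
        (103, 7, 60.0);
""")

orphans = conn.execute(ORPHANS_SQL).fetchall()
print("orphaned orders:", orphans)
```

Wrapping the same SELECT in a COUNT(*) gives the headline number to put in front of management.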

How does foreign key resolve queries in terms of performance(indexing) on databases?

It's important to clarify concepts first. A "foreign key" is the column, while a "foreign key constraint" is the integrity rule.

Now, engines check the integrity rule more efficiently when there is an index that helps them find the related rows fast. In general, heap-based engines such as Oracle, DB2, and PostgreSQL don't add an index on the foreign key column automatically when you create a foreign key constraint. MariaDB and MySQL (InnoDB) do this by default. SQL Server, although it is built around clustered indexes, does not.

Those two models are quite different and in general heap-based engines tend to be more efficient. In engines that don't create the index for you, however, the database designer needs to set up the helpful FK indexes manually. If the designer forgets to do this (it happens often), the performance of data modification statements and joins can worsen over time. On the flip side, an experienced designer can add highly customized indexes (covering indexes, specific column ordering, expressions, etc.) to serve many queries with a minimal number of indexes. This requires more knowledge, and in my experience most designers do not take full advantage of these features.
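As an illustration of such a customized index, the sketch below puts the FK column first and adds the columns a particular query reads, so the query can be answered from the index alone. It uses Python's sqlite3 module as a stand-in engine; the schema, the index name and the query are assumptions made for the example, and the plan wording is SQLite's own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        order_date TEXT NOT NULL,
        total REAL NOT NULL
    );
    -- FK column first (serves FK checks and joins), then the columns the
    -- query below filters, sorts and selects on.
    CREATE INDEX idx_orders_customer_date_total
        ON orders(customer_id, order_date, total);
""")

sql = """
    SELECT order_date, total
    FROM orders
    WHERE customer_id = ?
    ORDER BY order_date
"""

# The last column of each EXPLAIN QUERY PLAN row is the detail text.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, (1,))]
uses_covering_index = any(
    "COVERING INDEX idx_orders_customer_date_total" in step for step in plan
)

print(plan)
print("answered from the index alone:", uses_covering_index)
```

One index here does double duty: its leading column supports the FK lookups, and the trailing columns cover this query without touching the table.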


