Does Introducing Foreign Keys to MySQL Reduce Performance?

Assuming:

  1. You are already using a storage engine that supports FKs (e.g., InnoDB)
  2. You already have indexes on the columns involved

Then I would guess that you'll get better performance by having MySQL enforce integrity. Enforcing referential integrity is, after all, something that database engines are optimized to do. Writing your own code to manage integrity in Ruby is going to be slow in comparison.

If you need to move from MyISAM to InnoDB to get the FK functionality, you need to consider the tradeoffs in performance between the two engines.

If you don't already have indexes, you need to decide whether you want them. Generally speaking, if you're doing more reads than writes, you want (need, even) the indexes.

Adding an FK on top of columns that are already indexed should cause less of an overall performance hit than implementing the same checks in your application code.
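
As a rough illustration of letting the engine do the check, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL/InnoDB (an assumption, made only so the example runs without a server), and the users/orders tables are hypothetical. The point is that the orphan row is rejected by the database itself, with no lookup query written in application code.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL/InnoDB; table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id))"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO orders (user_id) VALUES (1)")  # parent row exists: accepted

# The engine rejects a row whose parent does not exist.
try:
    conn.execute("INSERT INTO orders (user_id) VALUES (999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Doing the same check in application code would need an extra SELECT per insert, and it would still be open to races between the check and the write.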

Does Foreign Key improve query performance?

Foreign keys are a referential integrity tool, not a performance tool. At least in SQL Server, creating an FK does not create an associated index, so you should create indexes on all FK columns to improve lookup times.
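
A small sketch of that point, again using Python's sqlite3 module as a stand-in (an assumption; SQLite happens to behave like SQL Server here, while MySQL's InnoDB creates an index on the referencing column automatically if none exists). The table and index names are hypothetical.

```python
import sqlite3

# Assumption: SQLite as a stand-in engine; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER REFERENCES users(id))"
)

def index_names(table):
    # The second column of PRAGMA index_list is the index name.
    return [row[1] for row in conn.execute(f"PRAGMA index_list('{table}')")]

before = index_names("orders")  # declaring the FK created no index
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
after = index_names("orders")   # the explicit index now exists
```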

Foreign keys when cascades aren't needed

You should do it. If it affects write performance at all, it's a "pixel" problem, too small to matter.

The main performance problems are in reads, where FKs can help the query optimizer select a better plan. Even if your DBMS (or DBMSs, if you provide a cross-DBMS solution) does not gain from them now, it may later.

So the answer is yes: it's not only aesthetics.

Can foreign keys help me, or should I consider a new database schema?

Your current data model is not optimal. To avoid the overhead of a BEFORE INSERT trigger (and the additional query) you should introduce a new table, called reservations. The userID column does not belong in the cars table. To indicate who is currently using a car, use the new reservations table, which should have the following columns:

  1. car_id - integer, FK into cars table
  2. user_id - integer, FK into users table
  3. usage_period - tstzrange

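These should be covered by two exclusion constraints (using integer columns in a GiST index requires the btree_gist extension)

EXCLUDE USING GIST (car_id WITH =, usage_period WITH &&)
EXCLUDE USING GIST (user_id WITH =, usage_period WITH &&)

to prevent the same car being used by multiple users at the same time, and the same user driving different cars at the same time. A single constraint over (car_id, user_id, usage_period) would only reject overlapping reservations of the same car by the same user.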

When you need to know who was driving the car at a given location (or a list of location points), you simply join with the reservations table.

mysql multiple foreign key vs inner join

Copying the value of an attribute from the user table into the hobby table isn't a "foreign key", that's redundancy.

Avoiding JOIN operations is not usually how we meet our performance objectives; joins are a normal part of how relational databases operate.

I'd go with the normalized design as a first cut. Each attribute should be dependent on the key, the whole key, and nothing but the key. The "firstname" attribute is dependent on the id of the user, not the hobby.

Sometimes, we do gain performance benefits by introducing redundancy into the database. We have to do that in a controlled way, and make sure that we don't get update anomalies. (Consider what changes we want to apply if the value of the "firstname" attribute is updated... do we make that change to the user table, the user_hobby table, or both?)

Likely, "firstname" is not unique in the user table, so we definitely don't want a foreign key referencing that column; we want foreign keys that reference the user table to reference the PRIMARY KEY of the table.

There's no point in having two foreign keys defined between user_hobby and user, if a user_hobby is related to exactly one user. We only need one foreign key... we just store the id from the user table in the user_hobby table.
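
A minimal sketch of the normalized design, run through Python's sqlite3 module as a stand-in for MySQL (an assumption). The column names beyond "id" and "firstname" are hypothetical. The user_hobby table stores only the user's id, so a changed first name is updated in one place and every joined row reflects it.

```python
import sqlite3

# Assumption: SQLite as a stand-in engine; the hobby column is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, firstname TEXT NOT NULL);
CREATE TABLE user_hobby (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(id),  -- the single FK to user
    hobby TEXT NOT NULL
);
INSERT INTO user (id, firstname) VALUES (1, 'Ann');
INSERT INTO user_hobby (user_id, hobby) VALUES (1, 'chess'), (1, 'rowing');
""")

# One update in one table; no redundant copies to keep in sync.
conn.execute("UPDATE user SET firstname = 'Anna' WHERE id = 1")

rows = conn.execute(
    "SELECT u.firstname, h.hobby FROM user_hobby h"
    " JOIN user u ON u.id = h.user_id ORDER BY h.hobby"
).fetchall()
```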

Why are foreign keys more used in theory than in practice?

The reason foreign key constraints exist is to guarantee that the referenced rows exist.

"The foreign key identifies a column or a set of columns in one table that refers to a column or set of columns in another table. The values in one row of the referencing columns must occur in a single row in the referenced table.

Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL). This way references can be made to link information together and it is an essential part of database normalization." (Wikipedia)


RE: Your question: "I can't imagine the need to join tables by fields that aren't FKs":

When defining a foreign key constraint, the column(s) in the referencing table must refer to the primary key of the referenced table, or at least to a candidate key.

Joins have no such requirement: you can join on columns that are neither primary keys nor candidate keys.

The following is an example that could make sense:

CREATE TABLE clients (
    client_id uniqueidentifier NOT NULL,
    client_name nvarchar(250) NOT NULL,
    client_country char(2) NOT NULL
);

CREATE TABLE suppliers (
    supplier_id uniqueidentifier NOT NULL,
    supplier_name nvarchar(250) NOT NULL,
    supplier_country char(2) NOT NULL
);

And then query as follows:

SELECT
    client_name, supplier_name, client_country
FROM
    clients
INNER JOIN
    suppliers ON (clients.client_country = suppliers.supplier_country)
ORDER BY
    client_country;

Another case where these joins make sense is in databases that offer geospatial features, like SQL Server 2008 or Postgres with PostGIS. You will be able to do queries like these:

SELECT
    state, electorate
FROM
    electorates
INNER JOIN
    postcodes ON (postcodes.Location.STIntersects(electorates.Location) = 1);

Source: ConceptDev - SQL Server 2008 Geography: STIntersects, STArea

You can see another similar geospatial example in the accepted answer to the post "Sql 2008 query problem - which LatLong’s exists in a geography polygon?":

SELECT
    G.Name, COUNT(CL.Id)
FROM
    GeoShapes G
INNER JOIN
    CrimeLocations CL ON G.ShapeFile.STIntersects(CL.LatLong) = 1
GROUP BY
    G.Name;

These are all valid SQL joins that have nothing to do with foreign keys and candidate keys, and can still be useful in practice.

Performance of primary/foreign key versus single table with no primary key

If you wanted to find all the keys associated with a given user you might use the following JOIN query:

SELECT k.`Key`
FROM `keys` k INNER JOIN users u
ON k.UserId = u.UserId
WHERE u.UserName = 'username';

The columns that would benefit most from an index in this case are the UserId columns in the two tables. (KEY and KEYS are reserved words in MySQL, so those identifiers have to be quoted with backticks.) If the index on keys.UserId exists, then, for a given user, looking up keys in the keys table is a B-tree lookup: logarithmic in the table size, which is close to constant in practice.

Without any indexes, MySQL will have to do a full table scan for each user as it tries to find the keys corresponding to that user.
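
The difference can be seen in a query plan. The sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN (an assumption; the plan wording differs between engines and versions). The user_keys table and its columns are hypothetical names chosen to avoid reserved words.

```python
import sqlite3

# Assumption: SQLite as a stand-in engine; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_keys ("
    " id INTEGER PRIMARY KEY, user_id INTEGER, key_value TEXT)"
)
conn.executemany(
    "INSERT INTO user_keys (user_id, key_value) VALUES (?, ?)",
    [(i % 100, f"key-{i}") for i in range(1000)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes one plan step.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT key_value FROM user_keys WHERE user_id = 42"
plan_without_index = plan(query)  # a scan of the whole table
conn.execute("CREATE INDEX idx_user_keys_user_id ON user_keys (user_id)")
plan_with_index = plan(query)     # a search that uses the new index
```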

Are foreign keys really necessary in a database design?

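Foreign keys help enforce referential integrity at the data level. They can also improve performance because in some engines (MySQL's InnoDB, for example) the referencing columns are indexed automatically; in others you have to create that index yourself.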

Is it fine to have one table with foreign keys to many other tables?

You are correct that storing 30 columns where only one is not NULL is inefficient -- those NULL values typically occupy space in each row.

Assuming all the ids are the same type, you can simplify the structure to use:

table_name
table_id

Just two columns. Unfortunately, I am not aware of any database that allows "conditional" foreign key relationships. Although I advocate defining foreign key relationships this might be a case where you choose not to have them.

That is not fully satisfying. Although you can ensure some integrity using triggers, this lacks the "cascading" features of foreign keys. You can implement that with more triggers. Yuck!

One alternative is a separate notations table for each table:

notations_table1
text, id

notations_table2
text, id

. . .

You can then bring these together using union all:

create view notations as (
select text, id, 'table1' as table_name from notations_table1 union all
select text, id, 'table2' as table_name from notations_table2 union all
. . .
);

The underlying tables can then have properly formatted foreign key constraints -- and even cascading ones. Unfortunately, this lacks a single column as an id. The "real" id is a combination of table_name and id.
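
A runnable sketch of this layout with two tables, using Python's sqlite3 module as a stand-in engine (an assumption). The parent tables table1 and table2 are hypothetical placeholders. Each notations table carries an ordinary cascading foreign key, and the view stitches them together.

```python
import sqlite3

# Assumption: SQLite as a stand-in engine; table1/table2 are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY);
CREATE TABLE table2 (id INTEGER PRIMARY KEY);
CREATE TABLE notations_table1 (
    text TEXT NOT NULL,
    id INTEGER NOT NULL REFERENCES table1(id) ON DELETE CASCADE
);
CREATE TABLE notations_table2 (
    text TEXT NOT NULL,
    id INTEGER NOT NULL REFERENCES table2(id) ON DELETE CASCADE
);
CREATE VIEW notations AS
    SELECT text, id, 'table1' AS table_name FROM notations_table1
    UNION ALL
    SELECT text, id, 'table2' AS table_name FROM notations_table2;
INSERT INTO table1 (id) VALUES (1);
INSERT INTO table2 (id) VALUES (1);
INSERT INTO notations_table1 (text, id) VALUES ('note on table1 row', 1);
INSERT INTO notations_table2 (text, id) VALUES ('note on table2 row', 1);
""")

all_notes = "SELECT text, id, table_name FROM notations ORDER BY table_name"
before = conn.execute(all_notes).fetchall()
conn.execute("DELETE FROM table1 WHERE id = 1")  # cascades to notations_table1
after = conn.execute(all_notes).fetchall()
```

Note that both notes share id 1, which is why the view needs table_name alongside id to identify a row.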

Some databases have other mechanisms that might be able to help. For instance, Postgres supports a form of inheritance that might be useful in this context.

How to properly increase MySQL performance by indexing

Try removing the ORDER BY and doing the sorting in your application logic.

Hopefully that reduces the load of your query.


