Does introducing foreign keys to MySQL reduce performance?
Assuming:
- You are already using a storage engine that supports FKs (i.e., InnoDB)
- You already have indexes on the columns involved
Then I would guess that you'll get better performance by having MySQL enforce integrity. Enforcing referential integrity is, after all, something that database engines are optimized to do. Writing your own code to manage integrity in Ruby is going to be slow in comparison.
If you need to move from MyISAM to InnoDB to get the FK functionality, you need to consider the tradeoffs in performance between the two engines.
If you don't already have indexes, you need to decide if you want them. Generally speaking, if you're doing more reads than writes, you want (need, even) the indexes.
Stacking an FK on top of stuff that is currently indexed should cause less of an overall performance hit than implementing those kinds of checks in your application code.
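As a minimal sketch of letting the database do the enforcement, the snippet below uses Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB (an assumption, chosen only so the example runs without a server). The authors and posts tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " author_id INTEGER NOT NULL REFERENCES authors(id),"
    " title TEXT NOT NULL)"
)
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'valid post')")


def insert_orphan(connection):
    """Try to insert a post whose author does not exist; return True if rejected."""
    try:
        connection.execute(
            "INSERT INTO posts (author_id, title) VALUES (99, 'orphan')"
        )
    except sqlite3.IntegrityError:
        return True
    return False


rejected = insert_orphan(conn)
post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(rejected, post_count)
```

The application never has to run its own "does this author exist?" query before each insert; the engine rejects the orphan row itself.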
Does Foreign Key improve query performance?
Foreign keys are a referential integrity tool, not a performance tool. At least in SQL Server, creating an FK does not create an associated index, and you should create indexes on all FK fields to improve lookup times.
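The point about missing indexes can be checked directly. The sketch below uses Python's sqlite3 module rather than SQL Server (an assumption made for runnability; SQLite likewise does not index the referencing column on its own). The customers and orders tables and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)


def index_names(connection, table):
    """Return the names of all indexes defined on the given table."""
    rows = connection.execute(f"PRAGMA index_list('{table}')").fetchall()
    return [row[1] for row in rows]


# Declaring the FK created no index on orders.customer_id.
before = index_names(conn, "orders")
# Add the index on the referencing column explicitly.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")
after = index_names(conn, "orders")
print(before, after)
```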
Foreign keys when cascades aren't needed
You should define them. If they affect write performance at all, it's a "pixel" problem.
The main performance problems are on reads, and FKs can help the query optimizer select the best plan. Even if your DBMS (or DBMSs, if you provide a cross-DBMS solution) does not gain from them now, it may later.
So the answer is yes: it's not only aesthetics.
Can foreign keys help me, or should I consider a new database schema?
Your current data model is not optimal. To avoid the overhead of a before insert trigger (and the additional query), you should introduce a new table called reservations. The userID column does not belong in the cars table; to indicate who is currently using a car, use the new reservations table, which should have the following columns:
- car_id - integer, FK into the cars table
- user_id - integer, FK into the users table
- usage_period - tstzrange
These should be covered by two exclusion constraints (using = on integer columns in a GiST index requires the btree_gist extension):
EXCLUDE USING GIST (car_id WITH =, usage_period WITH &&)
EXCLUDE USING GIST (user_id WITH =, usage_period WITH &&)
The first prevents the same car from being used by multiple users at the same time; the second prevents the same user from driving different cars at the same time. A single constraint over car_id, user_id and usage_period together would only reject overlapping periods for the same car and user pair.
When you need to know who was driving the car at a given location (or a list of location points), simply join with the reservations table.
mysql multiple foreign key vs inner join
Copying the value of an attribute from the user table into the hobby table isn't a "foreign key", that's redundancy.
Our performance objectives are not usually met with an approach of avoiding JOIN operations, which are a normal part of how relational databases operate.
I'd go with the normalized design as a first cut. Each attribute should be dependent on the key, the whole key, and nothing but the key. The "firstname" attribute is dependent on the id of the user, not the hobby.
Sometimes, we do gain performance benefits by introducing redundancy into the database. We have to do that in a controlled way, and make sure that we don't get update anomalies. (Consider what changes we want to apply if the value of the "firstname" attribute is updated... do we make that change to the user table, the user_hobby table, or both?)
Likely, "firstname" is not unique in the user table, so we definitely don't want a foreign key referencing that column; we want foreign keys that reference the user table to reference the PRIMARY KEY of the table.
There's no point in having two foreign keys defined between user_hobby and user, if a user_hobby is related to exactly one user. We only need one foreign key... we just store the id from the user table in the user_hobby table.
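A minimal sketch of the normalized design, run through Python's sqlite3 module as a stand-in for MySQL (an assumption). The column layout of user and user_hobby is hypothetical beyond what the answer names: firstname lives only in user, and user_hobby carries a single foreign key to the user's primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, firstname TEXT NOT NULL);
CREATE TABLE user_hobby (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(id),  -- the one foreign key
    hobby TEXT NOT NULL
);
INSERT INTO user (id, firstname) VALUES (1, 'Ann');
INSERT INTO user_hobby (user_id, hobby) VALUES (1, 'chess'), (1, 'golf');
""")

# firstname is stored once, so a rename is a single-row update.
conn.execute("UPDATE user SET firstname = 'Anna' WHERE id = 1")

# The JOIN picks up the current firstname for every hobby row.
rows = conn.execute(
    "SELECT u.firstname, h.hobby FROM user_hobby h"
    " INNER JOIN user u ON u.id = h.user_id ORDER BY h.hobby"
).fetchall()
print(rows)
```

Because firstname is not copied into user_hobby, there is no second place to update and no chance of the two copies drifting apart.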
Why are foreign keys more used in theory than in practice?
The reason foreign key constraints exist is to guarantee that the referenced rows exist.
"The foreign key identifies a column or a set of columns in one table that refers to a column or set of columns in another table. The values in one row of the referencing columns must occur in a single row in the referenced table.
Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL). This way references can be made to link information together and it is an essential part of database normalization." (Wikipedia)
RE: Your question: "I can't imagine the need to join tables by fields that aren't FKs":
When defining a foreign key constraint, the referenced column(s) must be the primary key of the referenced table, or at least a candidate key.
When doing joins, there is no need to join with primary keys or candidate keys.
The following is an example that could make sense:
CREATE TABLE clients (
client_id uniqueidentifier NOT NULL,
client_name nvarchar(250) NOT NULL,
client_country char(2) NOT NULL
);
CREATE TABLE suppliers (
supplier_id uniqueidentifier NOT NULL,
supplier_name nvarchar(250) NOT NULL,
supplier_country char(2) NOT NULL
);
And then query as follows:
SELECT
client_name, supplier_name, client_country
FROM
clients
INNER JOIN
suppliers ON (clients.client_country = suppliers.supplier_country)
ORDER BY
client_country;
Another case where these joins make sense is in databases that offer geospatial features, like SQL Server 2008 or Postgres with PostGIS. You will be able to do queries like these:
SELECT
state, electorate
FROM
electorates
INNER JOIN
postcodes on (postcodes.Location.STIntersects(electorates.Location) = 1);
Source: ConceptDev - SQL Server 2008 Geography: STIntersects, STArea
You can see another similar geospatial example in the accepted answer to the post "Sql 2008 query problem - which LatLong’s exists in a geography polygon?":
SELECT
G.Name, COUNT(CL.Id)
FROM
GeoShapes G
INNER JOIN
CrimeLocations CL ON G.ShapeFile.STIntersects(CL.LatLong) = 1
GROUP BY
G.Name;
These are all valid SQL joins that have nothing to do with foreign keys and candidate keys, and can still be useful in practice.
Performance of primary/foreign key versus single table with no primary key
If you wanted to find all the keys associated with a given user you might use the following JOIN query:
-- KEY and KEYS are reserved words in MySQL, so they are quoted with backticks
SELECT k.`Key`
FROM `keys` k INNER JOIN users u
ON k.UserId = u.UserId
WHERE u.UserName = 'username';
The place that would benefit most from an index in this case is the UserId column in each of the two tables. With such an index, finding the keys for a given user is an index lookup (roughly logarithmic in the size of the table) rather than a scan of the whole table.
Without any indexes, MySQL has to do a full table scan for each user as it tries to find the keys corresponding to that user.
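The difference between a scan and an index lookup can be seen in the query plan. The sketch below uses Python's sqlite3 module instead of MySQL (an assumption; in MySQL you would inspect EXPLAIN output instead), and looks only at the keys-by-UserId lookup that the join performs for each matching user. The KeyId column and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (UserId INTEGER PRIMARY KEY, UserName TEXT NOT NULL);
CREATE TABLE "keys" (
    KeyId INTEGER PRIMARY KEY,
    UserId INTEGER NOT NULL,
    "Key" TEXT NOT NULL
);
""")


def plan(connection, sql, params=()):
    """Return the planner's description lines for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


lookup = 'SELECT "Key" FROM "keys" WHERE UserId = ?'

# No index on keys.UserId yet: the planner has to scan the table.
plan_before = plan(conn, lookup, (1,))

conn.execute('CREATE INDEX idx_keys_userid ON "keys" (UserId)')

# With the index in place the planner searches it instead.
plan_after = plan(conn, lookup, (1,))

print(plan_before)
print(plan_after)
```

The exact plan wording differs between SQLite versions, so the output is printed rather than quoted here.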
Are foreign keys really necessary in a database design?
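Foreign keys help enforce referential integrity at the data level. They can also improve performance when the referencing columns are indexed, which some engines (MySQL's InnoDB, for example) do automatically; others, such as SQL Server, leave creating that index to you.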
Is it fine to have one table with foreign keys to many other tables?
You are correct that storing 30 columns where only one is not NULL is inefficient -- those NULL values typically occupy space in each row.
Assuming all the ids are the same type, you can simplify the structure to use:
table_name
table_id
Just two columns. Unfortunately, I am not aware of any database that allows "conditional" foreign key relationships. Although I advocate defining foreign key relationships this might be a case where you choose not to have them.
That is not fully satisfying. Although you can ensure some integrity using triggers, this lacks the "cascading" features of foreign keys. You can implement that with more triggers. Yuck!
One alternative is a separate notations table for each table:
notations_table1
text, id
notations_table2
text, id
. . .
You can then bring these together using union all:
create view notations as (
select text, id, 'table1' as table_name from notations_table1 union all
select text, id, 'table2' as table_name from notations_table2 union all
. . .
);
The underlying tables can then have properly formatted foreign key constraints -- and even cascading ones. Unfortunately, this lacks a single column as an id. The "real" id is a combination of table_name and id.
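Here is a minimal sketch of the per-table notations approach with two tables, run through Python's sqlite3 module (an assumption; the same DDL idea applies to other engines). The parent tables table1 and table2 and their single id column are hypothetical placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY);
CREATE TABLE table2 (id INTEGER PRIMARY KEY);
-- Each notations table has a real, cascading foreign key to its own parent.
CREATE TABLE notations_table1 (
    text TEXT NOT NULL,
    id INTEGER NOT NULL REFERENCES table1(id) ON DELETE CASCADE
);
CREATE TABLE notations_table2 (
    text TEXT NOT NULL,
    id INTEGER NOT NULL REFERENCES table2(id) ON DELETE CASCADE
);
CREATE VIEW notations AS
    SELECT text, id, 'table1' AS table_name FROM notations_table1
    UNION ALL
    SELECT text, id, 'table2' AS table_name FROM notations_table2;
INSERT INTO table1 (id) VALUES (1);
INSERT INTO table2 (id) VALUES (1);
INSERT INTO notations_table1 (text, id) VALUES ('note on table1 row', 1);
INSERT INTO notations_table2 (text, id) VALUES ('note on table2 row', 1);
""")

all_notes = conn.execute(
    "SELECT text, id, table_name FROM notations ORDER BY table_name"
).fetchall()

# Deleting a parent row cascades to its notations and the view reflects it.
conn.execute("DELETE FROM table1 WHERE id = 1")
remaining = conn.execute(
    "SELECT table_name, id FROM notations ORDER BY table_name"
).fetchall()
print(all_notes, remaining)
```

Both rows have id 1, which illustrates why the pair (table_name, id) is needed to identify a notation through the view.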
Some databases have other mechanisms that might be able to help. For instance, Postgres supports a form of inheritance that might be useful in this context.
How to properly increase MySQL performance by indexing
Try removing the order by and doing your sorting in your application logic.
Hopefully that reduces your query load.