Index on multiple columns in Ruby on Rails
The order does matter in indexing.
- Put the most selective field first, i.e. the field that narrows down the number of rows fastest.
- The index will only be used insofar as you use its columns in sequence starting at the beginning, i.e. if you index on [:user_id, :article_id], you can perform a fast query on user_id or on user_id AND article_id, but NOT on article_id alone.
Your migration add_index line should look something like this:
add_index :user_views, [:user_id, :article_id]
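To see why only the leading column can be searched quickly, here is a minimal pure-Ruby sketch that treats the composite index as an array of [user_id, article_id] pairs kept in sorted order. The sample data and the method names (sample_index, articles_for_user, users_for_article) are made up for illustration; a real database uses a B-tree rather than a flat array, so this is only a model of the ordering.

```ruby
# Toy model of a composite index on [user_id, article_id]:
# pairs sorted by user_id first, then by article_id.
def sample_index
  [[3, 9], [1, 7], [2, 3], [1, 9], [3, 1], [2, 7]].sort
end

# Leading column: binary search jumps to the first entry for the user,
# then we read forward while the user_id still matches.
def articles_for_user(index, user_id)
  start = index.bsearch_index { |(uid, _)| uid >= user_id }
  return [] if start.nil?

  index.drop(start).take_while { |(uid, _)| uid == user_id }.map(&:last)
end

# Second column alone: the entries are not ordered by article_id,
# so every entry has to be inspected (a full scan).
def users_for_article(index, article_id)
  index.select { |(_, aid)| aid == article_id }.map(&:first)
end

index = sample_index
p articles_for_user(index, 2)
p users_for_article(index, 7)
```

The first lookup can stop as soon as the user_id changes, while the second has no choice but to walk the whole array, which is the toy equivalent of the index not being used.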
Question regarding 'unique' option
An easy way to do this in Rails is to use validates in your model with scoped uniqueness as follows (documentation):
validates :user, uniqueness: { scope: :article }
Index for multiple columns in ActiveRecord
You are comparing a composite index with a set of independent indices. They are just different.
Think of it this way: a compound index gives you rapid look-up of the first field in a nested set of fields followed by rapid look-up of the second field within ONLY the records already selected by the first field, followed by rapid look-up of the third field - again, only within the records selected by the previous two indices.
Let's take an example. Your database engine will take no more than 20 steps to locate a unique value within 1,000,000 records (if memory serves) if you are using an index. This is true whether you are using a composite or an independent index - but ONLY for the first field ("species" in your example, although I'd think you'd want Family, Species, and then Common Name).
Now, let's say that there are 100,000 matching records for this first field value. If you have only single indices, then any lookup within these records will take 100,000 steps: one for each record retrieved by the first index. This is because the second index will not be used (in most databases - this is a bit of a simplification) and a brute force match must be used.
If you have a composite index then your search is much faster because your second field search will have an index within the first set of values. In this case you'll need no more than 17 steps to get to your first matching value on field 2 within the 100,000 matches on field 1 (log base 2 of 100,000).
So: steps needed to find a unique record out of a database of 1,000,000 records using a composite index on 3 nested fields where the first retrieves 100,000 and the second retrieves 10,000 = 20 + 17 + 14 = 51 steps.
Steps needed under the same conditions with just independent indices = 20 + 100,000 + 10,000 = 110,020 steps.
Big difference, eh?
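The arithmetic above can be reproduced with a few lines of Ruby. This is only a sketch of the same simplified model (one step per halving of the search space, one step per row in a brute-force match); the method names are made up for illustration, and real engines with B-trees will show different constants.

```ruby
# Worst-case steps for a binary-search-style lookup over n entries.
def index_steps(n)
  Math.log2(n).ceil
end

# Composite index: each nested field is searched via the index.
def composite_steps(total, after_first, after_second)
  index_steps(total) + index_steps(after_first) + index_steps(after_second)
end

# Independent indices: only the first field uses an index,
# the remaining matches are checked row by row.
def independent_steps(total, after_first, after_second)
  index_steps(total) + after_first + after_second
end

puts composite_steps(1_000_000, 100_000, 10_000)   # 20 + 17 + 14
puts independent_steps(1_000_000, 100_000, 10_000) # 20 + 100,000 + 10,000
```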
Now, don't go nuts putting composite indices everywhere. First, they are expensive on inserts and updates. Second, they are only brought to bear if you are truly searching across nested data (for another example, I use them when pulling data for logins for a client over a given date range). Also, they are not worth it if you are working with relatively small data sets.
Finally, check your database documentation. Databases have grown extremely sophisticated in the ability to deploy indices these days and the Database 101 scenario I described above may not hold for some (although I always develop as if it does just so I know what I am getting).
How to implement a unique index on two columns in rails
add_index :subscriptions, [:user_id, :content_id], unique: true
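As a rough illustration of what the unique option enforces on a pair of columns, the following pure-Ruby sketch uses a Set of [user_id, content_id] pairs as a stand-in for the index. The helper name insert_pair is hypothetical and nothing here talks to a database; it only shows that each column may repeat on its own while the combination may not.

```ruby
require "set"

# Returns true if the pair was new and is now recorded,
# false if the same [user_id, content_id] pair was already present.
def insert_pair(seen, user_id, content_id)
  !seen.add?([user_id, content_id]).nil?
end

seen = Set.new
p insert_pair(seen, 1, 10) # new pair
p insert_pair(seen, 1, 11) # same user, different content: allowed
p insert_pair(seen, 2, 10) # different user, same content: allowed
p insert_pair(seen, 1, 10) # same pair again: rejected
```

In a real database the rejected insert surfaces as a constraint violation error rather than a false return value.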
How to specify a multiple column index correctly in Rails
For MySQL :
MySQL will be able to use the index [:foo_column, :bar_column] to query for conditions on both columns, and also for conditions on the left column only, but NOT the right column.
More info here : http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
So you should do
add_index :the_table, [:foo_column, :bar_column], :unique => true
add_index :the_table, :bar_column
To make sure you index everything properly
MySQL indexes columns left-to-right, so if you have a multi-column index like this: [:col1, :col2, :col3, :col4], you can query this index on:
- col1
- col1 + col2
- col1 + col2 + col3
- col1 + col2 + col3 + col4
So you can query the left-most columns
If you need anything else, you'll have to create more indexes
Again, that's only for MySQL, postgres may work differently
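The left-most rule described above can be sketched as a small Ruby helper. The name usable_prefix is made up for illustration, and the model only considers which columns a query constrains with equality conditions; it says nothing about what a particular optimizer actually chooses.

```ruby
# Returns the leading index columns a query can use, stopping at the
# first index column that the query does not constrain.
def usable_prefix(index_columns, query_columns)
  index_columns.take_while { |col| query_columns.include?(col) }
end

index = [:col1, :col2, :col3, :col4]

p usable_prefix(index, [:col1, :col2]) # => [:col1, :col2]
p usable_prefix(index, [:col1, :col3]) # => [:col1]
p usable_prefix(index, [:col2, :col3]) # => []
```

An empty result is the case where you would need to create another index, as with the separate index on bar_column earlier.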
Rails generate migration command to add an index on multiple columns in Ruby on Rails
Most migrations cannot be fully generated via the CLI. Instead, generate an empty migration and fill in the change method by hand.
rails generate migration AddIndexToPhoneToUsers
How to add a unique index between two references columns in Rails Migration
Try the following:
create_table :product_attr_vals do |t|
  t.references :product, foreign_key: true
  t.references :attr_val, foreign_key: true
  t.timestamps
end
# Add this line: a unique index on the pair of reference columns
add_index :product_attr_vals, [:product_id, :attr_val_id], unique: true
Separate indexes on two columns and unique constraint on the pair in Rails
As Erwin points out, the "Key (tag_id, quote_id)=(10, 1) is duplicated" constraint violation error message tells you that your unique constraint is already violated by your existing data. I infer from what's visible of your model that different users can each introduce a common association between a tag and a quote, so you see duplicates when you try to constrain uniqueness for just the quote_id,tag_id pair. Compound indexes are still useful for index access on leading keys (though slightly less efficiently than a single column index since the compound index will have lower key-density). You could probably get the speed you require along with the appropriate unique constraint with two indexes, a single column index on one of the ids and a compound index on all three ids with the other id as its leading field. If mapping from tag to quote was a more frequent access path than mapping from quote to tag, I would try this:
add_index :tag_assignments, :tag_id
add_index :tag_assignments, [:quote_id,:tag_id,:user_id], unique: true
If you're using Pg >= 9.2, you can take advantage of 9.2's index visibility maps to enable index-only scans of covering indexes. In this case there may be benefit to making the first index above contain all three ids, with tag_id and quote_id leading:
add_index :tag_assignments, [:tag_id, :quote_id, :user_id]
It's unclear how user_id constrains your queries, so you may find that you want indexes with its position promoted earlier as well.
What does add_index with multiple values do in a Rails migration?
It adds a multi-column index on columns one and two in the resources table.
The statement:
add_index :resources, [:one, :two], name: 'index_resources_one_two'
is equivalent to:
create index index_resources_one_two on resources(one, two)
Passing in a single column would only create index on that column only. For example the following line:
add_index :resources, :one, name: 'index_resources_one'
is equivalent to:
create index index_resources_one on resources(one)
The advantage of multi-column index is that it helps when you have a query with conditions on those multiple columns.
With multi-column index the query is worked on a smaller subset of data as compared to single column index when the query contains conditions on those multiple columns.
Say for example our resources table contains the following rows:
one, two
1, 1
1, 1
1, 3
1, 4
The following query:
select * from resources where one = 1 and two = 1;
would only have to work on the following two rows if a multi-column index is defined:
one, two
1, 1
1, 1
But without the multi-column index, say there is an index on only one, the query would have to work on all the rows with one equal to 1, which is four rows.
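The row counts in this example can be checked with a small pure-Ruby model in which each index is a hash from key to matching rows. The method names are hypothetical and the hash stands in for a real index structure, so treat it as a sketch of the idea rather than of how a database stores data.

```ruby
# The four rows of the resources table from the example: [one, two].
def sample_rows
  [[1, 1], [1, 1], [1, 3], [1, 4]]
end

# Index on `one` only: the lookup narrows to every row with that value
# of `one`; the condition on `two` still has to be checked row by row.
def candidates_with_single_index(rows, one)
  rows.group_by { |row| row[0] }.fetch(one, [])
end

# Index on [one, two]: the lookup narrows directly to the matching pair.
def candidates_with_composite_index(rows, one, two)
  rows.group_by { |row| row }.fetch([one, two], [])
end

rows = sample_rows
puts candidates_with_single_index(rows, 1).size       # rows left to filter
puts candidates_with_composite_index(rows, 1, 1).size # rows left to filter
```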