How to Fix a Slow Implicit Query on Pg_Attribute Table in Rails

In production, each Rails process will run that query once for each table/model it encounters. That's once per rails s, not per request: if you're seeing it repeatedly, I'd investigate whether your processes are being restarted frequently for some reason.

To eliminate those runtime queries entirely, you can generate a schema cache file on your server:

RAILS_ENV=production rails db:schema:cache:dump

(Rails 4: RAILS_ENV=production bin/rake db:schema:cache:dump)

That command will perform the queries immediately, and then write their results to a cache file, which future Rails processes will directly load instead of inspecting the database. Naturally, you'll then need to regenerate the cache after any future database schema changes.
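To make the mechanism concrete, here is a minimal sketch in plain Ruby of what a schema cache buys you. FakeDatabase and TinySchemaCache are hypothetical stand-ins invented for illustration, not Rails classes, and the real cache file format differs. The point is only that column metadata dumped once can be loaded by later processes without asking the database again.

```ruby
require "yaml"
require "tempfile"

# Hypothetical stand-in for a database connection; it counts metadata lookups.
class FakeDatabase
  attr_reader :metadata_queries

  def initialize
    @metadata_queries = 0
    @tables = { "votes" => %w[id user_id created_at] }
  end

  def table_names
    @tables.keys
  end

  # Plays the role of the per-table pg_attribute lookup.
  def columns(table)
    @metadata_queries += 1
    @tables.fetch(table)
  end
end

# Hypothetical cache: dump once, then load in every later process.
class TinySchemaCache
  def initialize(db, columns = {})
    @db = db
    @columns = columns
  end

  # Runs the lookups immediately and writes the results to a file.
  def self.dump(db, path)
    data = db.table_names.to_h { |table| [table, db.columns(table)] }
    File.write(path, YAML.dump(data))
  end

  # Builds a cache from the file without touching the database.
  def self.load(db, path)
    new(db, YAML.load_file(path))
  end

  # Falls back to the database only for tables missing from the cache.
  def columns(table)
    @columns[table] ||= @db.columns(table)
  end
end

cache_file = Tempfile.new(["schema_cache", ".yml"])
TinySchemaCache.dump(FakeDatabase.new, cache_file.path) # deploy-time step

fresh_process_db = FakeDatabase.new                     # a newly booted process
cache = TinySchemaCache.load(fresh_process_db, cache_file.path)
cache.columns("votes")
puts fresh_process_db.metadata_queries                  # no lookups were needed
```

The fallback in columns also shows why a stale cache is a problem: a table that is in the file is never re-inspected, so the file has to be regenerated after schema changes.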

Eager loading generating slower queries

Yes, that query seems to be generated by the pagination plugin. This query is necessary to estimate the total number of pages.

But if you know the number of records anyway (by doing a simple SELECT COUNT(*) FROM "votes" before), you can pass that number to will_paginate with the :total_entries option!

(See WillPaginate::Finder::ClassMethods for more info.)
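As a rough illustration of why passing the total helps, here is a simplified pager in plain Ruby. The paginate method, the Page struct and the counter argument are all hypothetical and are not will_paginate's implementation. They only model the idea that the expensive count is skipped when the caller already knows the total.

```ruby
# Hypothetical, simplified pager; not will_paginate's real code.
Page = Struct.new(:records, :total_entries, :total_pages)

def paginate(records, page:, per_page:, total_entries: nil, counter: nil)
  # Only fall back to the (potentially slow) count when no total was supplied.
  total = total_entries || counter.call
  offset = (page - 1) * per_page
  Page.new(records[offset, per_page] || [], total, (total.to_f / per_page).ceil)
end

votes = (1..95).to_a
count_calls = 0
slow_count = lambda do
  count_calls += 1 # stands in for the slow COUNT query
  votes.size
end

paginate(votes, page: 2, per_page: 30, counter: slow_count)
page = paginate(votes, page: 2, per_page: 30, total_entries: 95, counter: slow_count)

puts count_calls      # the slow count ran only for the first call
puts page.total_pages # => 4
```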

Btw, have you created an index for votes.user_id? Maybe the lack of one is slowing down the query. I'm wondering why the DISTINCT clause should take up so much time, as id probably already has a unique constraint (if not, try adding one).

Unexpected SQL queries to Postgres database on Rails/Heroku

The tables pg_class, pg_attribute, pg_depend etc. all describe tables, columns and dependencies in Postgres. In Rails, model classes are defined by the tables, so Rails reads the tables and columns to figure out the attributes for each model.

In development mode it looks up these values every time the model is accessed, so if you've made a recent change, Rails knows about it. In production mode, Rails caches this, so you would see these queries much less frequently and they really aren't a concern.

ActiveRecord query changing when a dot/period is in condition value

The difference between the two strategies for eager loading is discussed in the comments here:

https://github.com/rails/rails/blob/3-0-stable/activerecord/lib/active_record/association_preload.rb

From the documentation:

# The second strategy is to use multiple database queries, one for each
# level of association. Since Rails 2.1, this is the default strategy. In
# situations where a table join is necessary (e.g. when the +:conditions+
# option references an association's column), it will fallback to the table
# join strategy.

I believe that the dot in "foo.bar" is causing Active Record to think that you are putting a condition on a table other than the originating model's, which prompts the fallback to the table join strategy described in the documentation.
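A simplified sketch of that kind of heuristic is below. It is not Active Record's actual code: tables_in_string and eager_load_strategy are illustrative names and the regular expression is an assumption. It only shows how a scan for "identifier followed by a dot" cannot tell a table reference from a dot inside a quoted value.

```ruby
# Simplified sketch of a "does this condition mention another table?" check.
# Not Active Record's real implementation.
def tables_in_string(sql)
  # Any identifier directly followed by a dot is treated as a table name.
  sql.scan(/([a-zA-Z_]\w*)["`]?\./).flatten.map(&:downcase).uniq
end

def eager_load_strategy(base_table, where_sql)
  other_tables = tables_in_string(where_sql) - [base_table]
  other_tables.empty? ? :separate_queries : :single_join
end

p eager_load_strategy("people", %q{"people"."name" = 'fubar'})   # => :separate_queries
p eager_load_strategy("people", %q{"people"."name" = 'foo.bar'}) # => :single_join
```

In the second call the scan picks up "foo" from the string literal, so the check concludes that a table other than people is involved.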

The two separate queries run one against the Person model and the second against the Item model.

 Person.includes(:items).where(:name => 'fubar')

Person Load (0.2ms)  SELECT "people".* FROM "people" WHERE "people"."name" = 'fubar'
Item Load (0.4ms)  SELECT "items".* FROM "items" WHERE ("items".person_id = 1) ORDER BY items.ordinal

Because you run the second query against the Item model, it inherits the default scope where you specified order(:ordinal).

The second form, which attempts eager loading with a full join, runs off the Person model and will not use the default scope of the association.

 Person.includes(:items).where(:name => 'foo.bar')

Person Load (0.4ms)  SELECT "people"."id" AS t0_r0, "people"."name" AS t0_r1, 
"people"."created_at" AS t0_r2, "people"."updated_at" AS t0_r3, "items"."id" AS t1_r0, 
"items"."person_id" AS t1_r1, "items"."name" AS t1_r2, "items"."ordinal" AS t1_r3, 
"items"."created_at" AS t1_r4, "items"."updated_at" AS t1_r5 FROM "people" LEFT OUTER JOIN 
"items" ON "items"."person_id" = "people"."id" WHERE "people"."name" = 'foo.bar'

It seems a little buggy for Rails to think that, but I can see how it happens: given the several different ways you can supply conditions, the way to be sure of catching all of them is to scan the completed "WHERE" conditions for a dot and switch to the join strategy, and they leave it that way because both strategies are functional. I would actually go as far as saying that the aberrant behavior is in the first query, not the second. If you would like the ordering to persist for this query, I recommend one of the following:

1) If you want the association to have an order by when it is called, then you can specify that with the association. Oddly enough, this is in the documentation, but I could not get it to work.

Source: http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html#method-i-has_many

class Person < ActiveRecord::Base
  has_many :items, :order => 'items.ordinal'
end

2) Another method would be to just add the order statement to the query in question.

Person.includes(:items).where(:name => 'foo.bar').order('items.ordinal')

3) Along the same lines would be setting up a named scope

class Person < ActiveRecord::Base
  has_many :items
  # Rails 3 uses `scope`; `named_scope` is the Rails 2 name and predates `includes`.
  scope :with_items, includes(:items).order('items.ordinal')
end

And to call that:

Person.with_items.where(:name => 'foo.bar')

Rails: Default sort order for a rails model?

`default_scope`

This works for Rails 4+:

class Book < ActiveRecord::Base
  default_scope { order(created_at: :desc) }
end

For Rails 2.3, 3, you need this instead:

default_scope order('created_at DESC')

For Rails 2.x:

default_scope :order => 'created_at DESC'

Where created_at is the field you want the default sorting to be done on.

Note: ASC is the keyword to use for ascending and DESC is for descending (desc, NOT dsc!).

`scope`

Once you're used to that you can also use scope:

class Book < ActiveRecord::Base
  scope :confirmed, :conditions => { :confirmed => true }
  scope :published, :conditions => { :published => true }
end

For Rails 2 you need named_scope.

The :published scope gives you Book.published instead of
Book.where(:published => true).

Since Rails 3 you can 'chain' those methods together by concatenating them with periods between them, so with the above scopes you can now use Book.published.confirmed.

With this approach, the query is not executed until results are actually needed (lazy evaluation), so 7 scopes could be chained together and still result in only 1 database query, avoiding the performance cost of executing 7 separate queries.
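The lazy behavior can be imitated in a few lines of plain Ruby. TinyRelation below is a hypothetical miniature written for this illustration, not ActiveRecord::Relation. Chaining only records conditions, and the counter goes up only when the results are enumerated.

```ruby
# Hypothetical miniature of a lazily evaluated relation.
class TinyRelation
  include Enumerable

  class << self
    attr_accessor :queries_run
  end
  self.queries_run = 0

  def initialize(rows, filters = [])
    @rows = rows
    @filters = filters
  end

  # Chaining only records the condition and returns a new relation.
  def where(&condition)
    TinyRelation.new(@rows, @filters + [condition])
  end

  # The "query" runs here, once, when results are actually needed.
  def each(&block)
    TinyRelation.queries_run += 1
    @rows.select { |row| @filters.all? { |f| f.call(row) } }.each(&block)
  end
end

books = TinyRelation.new([
  { title: "A", published: true,  confirmed: true  },
  { title: "B", published: true,  confirmed: false },
  { title: "C", published: false, confirmed: true  }
])

chained = books.where { |b| b[:published] }.where { |b| b[:confirmed] }
puts TinyRelation.queries_run       # still 0: nothing has executed yet
p chained.map { |b| b[:title] }     # => ["A"]
puts TinyRelation.queries_run       # 1: both conditions ran in a single pass
```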

You can use a passed-in parameter such as a date or a user_id (something that will change at run time and so needs that 'lazy evaluation') with a lambda, like this:

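scope :recent_books, lambda { |since_when| where("created_at >= ?", since_when) }
# Note the `where` is making use of AREL syntax added in Rails 3.
# The opening brace must follow `lambda` on the same line, or Ruby will not attach the block to it.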

Finally you can disable default scope with:

Book.with_exclusive_scope { find(:all) }

or even better:

Book.unscoped.all

which will disable any filter (conditions) or sort (order by).

Note that the first version works in Rails2+ whereas the second (unscoped) is only for Rails3+

So
... if you're thinking, hmm, so these are just like methods then..., yup, that's exactly what these scopes are!

They are like having def self.method_name ...code... end but as always with ruby they are nice little syntactical shortcuts (or 'sugar') to make things easier for you!

In fact they are Class level methods as they operate on the 1 set of 'all' records.
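To see the equivalence, here is a small plain-Ruby sketch. The scope macro on TinyBook is a hypothetical stand-in written for this example (Rails' real macro does more, such as returning chainable relations), and the RECORDS array replaces a database table.

```ruby
class TinyBook
  # In-memory rows standing in for a table (made-up sample data).
  RECORDS = [
    { title: "A", published: true },
    { title: "B", published: false }
  ].freeze

  # Hypothetical stand-in for the `scope` macro: it just defines a class method.
  def self.scope(name, body)
    define_singleton_method(name) { |*args| body.call(*args) }
  end

  scope :published, -> { RECORDS.select { |r| r[:published] } }

  # The hand-written equivalent of the scope above.
  def self.published_via_method
    RECORDS.select { |r| r[:published] }
  end
end

p TinyBook.published.map { |r| r[:title] }           # => ["A"]
p TinyBook.published == TinyBook.published_via_method # => true
```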

Their format is changing, however: with Rails 4 there are deprecation warnings when using #scope without passing a callable object. For example, scope :red, where(color: 'red') should be changed to scope :red, -> { where(color: 'red') }.

As a side note, default_scope is easy to misuse.

This mainly happens when it is used for things like where clauses that limit (filter) the default selection, which is a bad idea for a default, rather than just for ordering results.

For where selections, just use regular named scopes and add the scope on in the query, e.g. Book.all.published, where published is a named scope.

In conclusion, scopes are really great and help you to push things up into the model for a 'fat model thin controller' DRYer approach.

PostgreSQL DISTINCT ON with different ORDER BY

Documentation says:

DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. [...] Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. [...] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).

Official documentation

So you'll have to add the address_id to the order by.

Alternatively, if you're looking for the full row that contains the most recently purchased product for each address_id, with that result sorted by purchased_at, then you're trying to solve a greatest-N-per-group problem, which can be solved by the following approaches:

The general solution that should work in most DBMSs:

SELECT t1.* FROM purchases t1
JOIN (
    SELECT address_id, max(purchased_at) max_purchased_at
    FROM purchases
    WHERE product_id = 1
    GROUP BY address_id
) t2
ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
ORDER BY t1.purchased_at DESC

A more PostgreSQL-oriented solution based on @hkf's answer:

SELECT * FROM (
  SELECT DISTINCT ON (address_id) *
  FROM purchases 
  WHERE product_id = 1
  ORDER BY address_id, purchased_at DESC
) t
ORDER BY purchased_at DESC
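For readers who find the two-step shape easier to follow outside SQL, the same greatest-N-per-group logic can be sketched in plain Ruby over in-memory rows. The sample data and the latest_per_address helper are made up for illustration. In a real application you would let the database do this.

```ruby
# Made-up sample rows standing in for the purchases table.
SAMPLE_PURCHASES = [
  { id: 1, address_id: 10, product_id: 1, purchased_at: Time.utc(2020, 1, 1) },
  { id: 2, address_id: 10, product_id: 1, purchased_at: Time.utc(2020, 3, 1) },
  { id: 3, address_id: 20, product_id: 1, purchased_at: Time.utc(2020, 2, 1) },
  { id: 4, address_id: 20, product_id: 2, purchased_at: Time.utc(2020, 4, 1) },
  { id: 5, address_id: 30, product_id: 1, purchased_at: Time.utc(2020, 5, 1) }
].freeze

# Hypothetical helper mirroring the queries above.
def latest_per_address(purchases, product_id:)
  purchases
    .select { |row| row[:product_id] == product_id }                 # WHERE product_id = ?
    .group_by { |row| row[:address_id] }                             # one group per address
    .map { |_address_id, rows| rows.max_by { |r| r[:purchased_at] } } # newest row per group
    .sort_by { |row| row[:purchased_at] }
    .reverse                                                         # ORDER BY purchased_at DESC
end

p latest_per_address(SAMPLE_PURCHASES, product_id: 1).map { |row| row[:id] }
# => [5, 2, 3]
```

The inner steps pick one row per address and the final sort is applied afterwards, which is the same separation the outer ORDER BY provides in the PostgreSQL version.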

Problem clarified, extended and solved here: Selecting rows ordered by some column and distinct on another
