Performance Tuning on Inner Join with Between Condition

Performance difference: condition placed at INNER JOIN vs WHERE clause

You're seeing a difference because the planner builds a different execution plan for each query. Arguably it should optimise the two queries to the same plan, and the fact that it doesn't may be a bug. Either way, the planner thinks it has to work in a particular way to get to the result in each statement.

When you put the condition in the JOIN, the planner will probably have to select from the table, filter by the "True" part, then join the result sets. I would imagine this is a large table, so there is a lot of data to look through, and it can't use the indexes as efficiently.

I suspect that when you put it in a WHERE clause, the planner chooses a more efficient route (i.e. either index based, or working from a pre-filtered dataset).

You could probably make the join work as fast (if not faster) by adding an index on the two columns. PostgreSQL supports multicolumn indexes, and since version 11 it also supports included columns via INCLUDE.

In short, the planner is the problem: it is choosing two different routes to get to the result sets, and one of those is not as efficient as the other. It's impossible for us to know the reasons without the full table definitions and the EXPLAIN ANALYZE output.

If you want specifics on why your particular query is doing this, you'll need to provide more information. However, the underlying reason is the planner choosing different routes.
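To see how you might compare the two forms yourself, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, not PostgreSQL, so the plans it prints say nothing about what the Postgres planner does; on Postgres you would run EXPLAIN ANALYZE on each query instead. The customers and orders tables and their columns are made up for illustration.

```python
import sqlite3

# Hypothetical schema and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 5), (2, 1, 50), (3, 2, 20), (4, 2, 500);
""")

# BETWEEN condition placed in the JOIN ... ON clause.
in_join = """
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o
        ON o.customer_id = c.id AND o.amount BETWEEN 10 AND 100
    ORDER BY o.id
"""

# Same condition moved to the WHERE clause.
in_where = """
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    WHERE o.amount BETWEEN 10 AND 100
    ORDER BY o.id
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

rows_join = conn.execute(in_join).fetchall()
rows_where = conn.execute(in_where).fetchall()
plan_join = plan(in_join)
plan_where = plan(in_where)

print("same rows:", rows_join == rows_where)
print("plan (condition in JOIN): ", plan_join)
print("plan (condition in WHERE):", plan_where)
```

For an inner join both forms return the same rows, so any difference you measure comes from the plan, which is why the plan output is the thing to post when asking for help.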

Additional Reading Material:

http://www.postgresql.org/docs/current/static/explicit-joins.html

Just skimmed it. It seems that the Postgres planner does not always reorder explicit joins (the join_collapse_limit setting controls how far it will go), so try changing the order of the joins in your statement to see if you then get the same performance. Just a thought.

SQL inner join - performance improvement

Another way to rewrite your query is with a join: move the dependent subquery into a derived table and join it to your main query.

select m.traceid, f.name, f.flowid, m.traceday, m.logtimestamp
from flow f
inner join messageinfo m on m.flowid = f.flowid
inner join (
    select flowid, max(traceid) as traceid
    from messageinfo
    group by flowid
) m1 on m.flowid = m1.flowid and m.traceid = m1.traceid
order by f.name;

Also add a composite index on messageinfo (flowid, traceid).
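If you want to try the rewrite on toy data first, here is a rough sketch using Python's built-in sqlite3 module rather than your real database. The column types, the sample rows and the index name idx_messageinfo_flow_trace are assumptions, since the original schema wasn't posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: only the column names come from the query above.
    CREATE TABLE flow (flowid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messageinfo (
        traceid INTEGER PRIMARY KEY,
        flowid INTEGER,
        traceday TEXT,
        logtimestamp TEXT
    );
    -- Composite index suggested above (the index name is made up).
    CREATE INDEX idx_messageinfo_flow_trace ON messageinfo (flowid, traceid);

    INSERT INTO flow VALUES (1, 'billing'), (2, 'audit');
    INSERT INTO messageinfo VALUES
        (10, 1, '2020-01-01', '08:00'),
        (11, 1, '2020-01-02', '09:00'),
        (20, 2, '2020-01-01', '10:00');
""")

# Join against a derived table holding the highest traceid per flow.
latest = conn.execute("""
    SELECT m.traceid, f.name, f.flowid, m.traceday, m.logtimestamp
    FROM flow f
    INNER JOIN messageinfo m ON m.flowid = f.flowid
    INNER JOIN (
        SELECT flowid, MAX(traceid) AS traceid
        FROM messageinfo
        GROUP BY flowid
    ) m1 ON m.flowid = m1.flowid AND m.traceid = m1.traceid
    ORDER BY f.name
""").fetchall()

for row in latest:
    print(row)
```

Each flow comes back once, with the row that has its highest traceid; the older trace 10 for flow 1 is dropped.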

Performance of inner join compared to cross join

Cross Joins produce results that consist of every combination of rows from two or more tables. That means if table A has 6 rows and table B has 3 rows, a cross join will result in 18 rows. There is no relationship established between the two tables – you literally just produce every possible combination.

With an inner join, column values from one row of a table are combined with column values from another row of another (or the same) table to form a single row of data.

If a WHERE clause relating the two tables is added to a cross join, it behaves like an inner join, because the WHERE condition limits which combinations are kept.
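A small sketch of both points, using Python's built-in sqlite3 module. The tables a and b and their columns are invented for the example: a has 6 rows and b has 3, matching the numbers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: a has 6 rows, b has 3 rows.
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (a_id INTEGER);
    INSERT INTO a VALUES (1), (2), (3), (4), (5), (6);
    INSERT INTO b VALUES (1), (2), (2);
""")

# Every combination of rows: 6 * 3 = 18.
cross_count = conn.execute("SELECT COUNT(*) FROM a CROSS JOIN b").fetchone()[0]

# Cross join restricted by a WHERE clause that relates the two tables.
cross_where = conn.execute("""
    SELECT a.id, b.a_id
    FROM a CROSS JOIN b
    WHERE a.id = b.a_id
    ORDER BY a.id
""").fetchall()

# The equivalent inner join.
inner = conn.execute("""
    SELECT a.id, b.a_id
    FROM a INNER JOIN b ON a.id = b.a_id
    ORDER BY a.id
""").fetchall()

print(cross_count)           # 18
print(cross_where == inner)  # True
```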

As long as your queries abide by common sense and vendor-specific performance guidelines (i), I like to think of the choice of join type as a simple matter of taste.

(i) Vendor Specific Performance Guidelines

  1. MySQL Performance Tuning and Optimization Resources
  2. PostgreSQL Performance Optimization

Does INNER JOIN performance depend on the order of tables?

Aliases and the order of the tables in the join (assuming it's an INNER JOIN) don't affect the final outcome, and thus don't affect performance, since the optimizer rearranges the order (if needed) when the query is executed.
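Here is a quick way to convince yourself of the "same outcome" part, sketched with Python's built-in sqlite3 module. The author and book tables are hypothetical, and this only checks the result rows; whether a given engine also produces the same plan is something you would confirm with its own EXPLAIN output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables for illustration.
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO book VALUES (1, 1, 'first'), (2, 2, 'second'), (3, 1, 'third');
""")

# author listed first in the join.
author_first = conn.execute("""
    SELECT a.name, b.title
    FROM author a
    INNER JOIN book b ON b.author_id = a.id
    ORDER BY b.id
""").fetchall()

# book listed first in the join; same condition, same select list.
book_first = conn.execute("""
    SELECT a.name, b.title
    FROM book b
    INNER JOIN author a ON b.author_id = a.id
    ORDER BY b.id
""").fetchall()

print(author_first == book_first)  # True
```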

You can read some more basic concepts about relational algebra here:
http://en.wikipedia.org/wiki/Relational_algebra#Joins_and_join-like_operators

Does the order of conditions in a join affect query performance?

The order of conditions in the ON clause should not affect performance. Why not? At a high level, there are three steps to SQL execution:

  1. Parse the query
  2. Construct and optimize the "executable" code
  3. Execute the code

The second step optimizes the query and should take into account different methods of executing it. The join conditions are part of this optimization, considered all at once.

In theory, the order of the joins does not matter either, although in a very complex query it could.


