Join vs. sub-query
Taken from the MySQL manual (13.2.10.11 Rewriting Subqueries as Joins):
A LEFT [OUTER] JOIN can be faster than an equivalent subquery because the server might be able to optimize it better—a fact that is not specific to MySQL Server alone.
So subqueries can be slower than a LEFT [OUTER] JOIN, but in my opinion their strength is slightly higher readability.
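To make the rewrite concrete, here is a minimal sketch that runs a NOT IN subquery and its LEFT JOIN ... IS NULL counterpart side by side. It uses an in-memory SQLite database only because that ships with Python; the tables t1 and t2 and their rows are made up for illustration. The two forms agree here because t2.id contains no NULLs, and which one is faster on a real server is something to measure, not assume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY);
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO t2 VALUES (2), (4);
""")

# Subquery form: rows of t1 whose id does not appear in t2.
subquery_sql = """
    SELECT id, name FROM t1
    WHERE id NOT IN (SELECT id FROM t2)
    ORDER BY id
"""
# Join form: keep only the t1 rows for which the outer join found no match.
join_sql = """
    SELECT t1.id, t1.name
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id
    WHERE t2.id IS NULL
    ORDER BY t1.id
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
print(subquery_rows)
print(join_rows)
```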
Subqueries vs joins
A "correlated subquery" (i.e., one in which the WHERE condition depends on values obtained from the rows of the containing query) is, conceptually, executed once for each row of the containing query. A non-correlated subquery (one in which the WHERE condition is independent of the containing query) can be executed once at the beginning. The SQL engine makes this distinction automatically.
But, yeah, explain-plan will give you the dirty details.
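A small sketch of the difference, again on in-memory SQLite with a made-up emp table. The first query compares each salary with the overall average (non-correlated); the second compares it with the average of the row's own department (correlated). SQLite's EXPLAIN QUERY PLAN is used as a stand-in for whatever plan tool your database has, and its output text varies by version, so it is printed here without relying on the exact wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        (1, 'eng', 100), (2, 'eng', 140), (3, 'ops', 80), (4, 'ops', 90);
""")

# Non-correlated: the inner query does not reference the outer row.
non_correlated = """
    SELECT id FROM emp
    WHERE salary > (SELECT AVG(salary) FROM emp)
    ORDER BY id
"""
# Correlated: the inner query references e.dept from the outer row.
correlated = """
    SELECT id FROM emp AS e
    WHERE salary > (SELECT AVG(salary) FROM emp WHERE dept = e.dept)
    ORDER BY id
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

non_correlated_ids = [r[0] for r in conn.execute(non_correlated)]
correlated_ids = [r[0] for r in conn.execute(correlated)]
print(non_correlated_ids, plan(non_correlated))
print(correlated_ids, plan(correlated))
```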
Understanding when to use a subquery over a join
Good Read for Subquery vs Inner Join
https://www.essentialsql.com/subquery-versus-inner-join/
Normal Join vs Join with Subqueries
In a decent database, there should be no difference between the two queries. Remember, SQL is a descriptive language, not a procedural language. That is, a SQL SELECT statement describes the result set that should be returned. It does not specify the steps for creating it.
Your two queries are semantically equivalent and the SQL optimizer should be able to recognize that.
Of course, SQL optimizers are not omniscient. So, sometimes how you write a query does affect the execution plan. However, the queries that you are describing are turned into execution plans that have no concept of "subquery", so it is reasonable that they would produce the same execution plan.
Note: Some databases -- such as MySQL and MS Access -- do not have very good optimizers and such queries do produce different execution plans. Alas.
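The original two queries are not shown here, so the following is only a sketch of the general shape: a plain join with a WHERE filter versus a join against a filtered subquery. The customers and orders tables are hypothetical, and SQLite is used just to have something runnable. The results are identical; whether the plans are identical too depends on the optimizer, which is why the plans are printed instead of assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 'open'), (11, 1, 'closed'), (12, 2, 'open');
""")

# Filter applied after the join, in the WHERE clause.
plain_join = """
    SELECT c.name, o.id
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.status = 'open'
    ORDER BY o.id
"""
# Same filter applied inside a subquery that is then joined.
join_with_subquery = """
    SELECT c.name, o.id
    FROM customers c
    JOIN (SELECT id, customer_id FROM orders WHERE status = 'open') o
      ON o.customer_id = c.id
    ORDER BY o.id
"""

plain_rows = conn.execute(plain_join).fetchall()
derived_rows = conn.execute(join_with_subquery).fetchall()
print(plain_rows == derived_rows)

# Print both plans so they can be compared by eye.
for sql in (plain_join, join_with_subquery):
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print(row[-1])
    print("--")
```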
SQL Joins Vs SQL Subqueries (Performance)?
I would EXPECT the first query to be quicker, mainly because you have an equivalence and an explicit JOIN. In my experience IN is a very slow operator, since SQL normally evaluates it as a series of WHERE clauses separated by "OR" (WHERE x=Y OR x=Z OR...).
As with ALL THINGS SQL though, your mileage may vary. The speed will depend a lot on indexes (do you have indexes on both ID columns? That will help a lot...) among other things.
The only REAL way to tell with 100% certainty which is faster is to turn on performance tracking (IO Statistics is especially useful) and run them both. Make sure to clear your cache between runs!
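Here is a rough sketch of "run them both and measure", using Python's sqlite3 and time.perf_counter in place of a real server's performance tracking. The tables a and b, the index name, and the row counts are all invented for the example. The data is built so that each row of a matches at most one row of b; otherwise the JOIN would return duplicates that IN does not, and the two queries would no longer be equivalent.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
    CREATE INDEX idx_b_a_id ON b (a_id);
""")
conn.executemany("INSERT INTO a VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(10_000)])
# Every third row of a gets exactly one match in b, so the join adds no duplicates.
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(i, i * 3) for i in range(3_000)])

in_sql = "SELECT a.id FROM a WHERE a.id IN (SELECT a_id FROM b) ORDER BY a.id"
join_sql = "SELECT a.id FROM a JOIN b ON b.a_id = a.id ORDER BY a.id"

def timed(sql):
    # Return the rows together with the wall-clock time taken to fetch them.
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start

in_rows, in_seconds = timed(in_sql)
join_rows, join_seconds = timed(join_sql)
print(f"IN:   {len(in_rows)} rows in {in_seconds:.4f}s")
print(f"JOIN: {len(join_rows)} rows in {join_seconds:.4f}s")
```

A single run like this is noisy; repeat it several times, and on a real server measure against realistic data volumes before drawing conclusions.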
Sub query vs joins performance
I would probably write this query using joins:
SELECT
    s.siteid,
    COALESCE(si.CountUniquePermissions, 0) AS CountUniquePermissions,
    COALESCE(si.CountNotModified30Days, 0) AS CountNotModified30Days
FROM sites s
LEFT JOIN
(
    -- Aggregate ScannedItems once per site instead of once per output row.
    SELECT
        siteid,
        COUNT(CASE WHEN CountUniqueRoleAssignments > 0 THEN 1 END)
            AS CountUniquePermissions,
        COUNT(CASE WHEN Modified < DATEADD(day, -30, GETDATE()) THEN 1 END)
            AS CountNotModified30Days
    FROM ScannedItems
    GROUP BY siteid
) si
    -- Sites with no scanned items get NULLs here, which COALESCE turns into 0.
    ON si.siteid = s.siteid
ORDER BY
    s.siteid;
The above query has no WHERE or HAVING clauses, and so I don't see any obvious way to tune it further using indices. But it at least has the potential advantage over your current query that it doesn't involve N^2 behavior with correlated subqueries in the select clause.
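To check that the join form returns the same counts as correlated subqueries in the select clause, here is a small sketch on in-memory SQLite. The correlated version is a guess at what the original query looked like, since it is not shown; the sample rows are made up; and a fixed cutoff date stands in for DATEADD(day, -30, GETDATE()), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (siteid INTEGER PRIMARY KEY);
    CREATE TABLE ScannedItems (
        siteid INTEGER, CountUniqueRoleAssignments INTEGER, Modified TEXT);
    INSERT INTO sites VALUES (1), (2), (3);
    INSERT INTO ScannedItems VALUES
        (1, 2, '2020-01-01'), (1, 0, '2024-06-01'),
        (2, 1, '2024-06-15');
""")
cutoff = "2024-05-01"  # assumed stand-in for DATEADD(day, -30, GETDATE())

# Assumed shape of the original: one correlated subquery per output column.
correlated_sql = """
    SELECT s.siteid,
           (SELECT COUNT(*) FROM ScannedItems i
             WHERE i.siteid = s.siteid AND i.CountUniqueRoleAssignments > 0),
           (SELECT COUNT(*) FROM ScannedItems i
             WHERE i.siteid = s.siteid AND i.Modified < :cutoff)
    FROM sites s
    ORDER BY s.siteid
"""
# Join form: ScannedItems is aggregated once, then joined to sites.
join_sql = """
    SELECT s.siteid,
           COALESCE(si.CountUniquePermissions, 0),
           COALESCE(si.CountNotModified30Days, 0)
    FROM sites s
    LEFT JOIN (
        SELECT siteid,
               COUNT(CASE WHEN CountUniqueRoleAssignments > 0 THEN 1 END)
                   AS CountUniquePermissions,
               COUNT(CASE WHEN Modified < :cutoff THEN 1 END)
                   AS CountNotModified30Days
        FROM ScannedItems
        GROUP BY siteid
    ) si ON si.siteid = s.siteid
    ORDER BY s.siteid
"""

params = {"cutoff": cutoff}
correlated_rows = conn.execute(correlated_sql, params).fetchall()
join_rows = conn.execute(join_sql, params).fetchall()
print(correlated_rows)
print(join_rows)
```

Site 3 has no scanned items, so it exercises the COALESCE path in the join form and the empty COUNT(*) in the correlated form.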