Why does the following join increase the query time significantly?
Rewritten with (recommended) explicit ANSI JOIN syntax:
SELECT COUNT(impression_id), imp.os_id, os.os_desc
FROM bi.impressions imp
JOIN bi.os_desc os ON os.os_id = imp.os_id
GROUP BY imp.os_id, os.os_desc;
First of all, your second query might be wrong if more or fewer than exactly one match is found in os_desc for every row in impressions.
This can be ruled out if you have a foreign key constraint on os_id in place that guarantees referential integrity, plus a NOT NULL constraint on bi.impressions.os_id. If so, in a first step, simplify to:
SELECT COUNT(*) AS ct, imp.os_id, os.os_desc
FROM bi.impressions imp
JOIN bi.os_desc os USING (os_id)
GROUP BY imp.os_id, os.os_desc;
count(*) is faster than count(column) and equivalent here if the column is NOT NULL. Also add a column alias for the count.
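To make the two caveats above concrete, here is a minimal sketch that uses Python's sqlite3 module as a stand-in for PostgreSQL. The table contents are invented for illustration, and the schema deliberately omits the foreign key and NOT NULL constraints so the failure cases can occur. It shows that count(*) and count(column) differ once NULLs appear, and that the inner join silently drops rows without a match.

```python
import sqlite3

# Hypothetical miniature of the two tables; SQLite stands in for PostgreSQL.
# No FK or NOT NULL constraint is declared, so bad rows are allowed on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE os_desc (os_id INTEGER PRIMARY KEY, os_desc TEXT);
    CREATE TABLE impressions (impression_id INTEGER PRIMARY KEY, os_id INTEGER);
    INSERT INTO os_desc VALUES (1, 'linux'), (2, 'macos');
    -- os_id 9 has no match in os_desc; the last row has a NULL os_id.
    INSERT INTO impressions (os_id) VALUES (1), (1), (2), (9), (NULL);
""")

# count(*) counts rows; count(column) skips rows where the column is NULL.
count_star, count_col = conn.execute(
    "SELECT COUNT(*), COUNT(os_id) FROM impressions"
).fetchone()

# The inner join drops both the unmatched row (os_id 9) and the NULL row.
joined = conn.execute("""
    SELECT imp.os_id, os.os_desc, COUNT(*) AS ct
    FROM impressions imp
    JOIN os_desc os ON os.os_id = imp.os_id
    GROUP BY imp.os_id, os.os_desc
    ORDER BY imp.os_id
""").fetchall()

# Sanity check: how many impressions would the inner join lose?
lost = conn.execute("""
    SELECT COUNT(*)
    FROM impressions imp
    LEFT JOIN os_desc os ON os.os_id = imp.os_id
    WHERE os.os_id IS NULL
""").fetchone()[0]

print(count_star, count_col, joined, lost)
```

If the last query returns 0 on your real data, the join loses no rows and the simplification to count(*) is safe.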
Faster yet:
SELECT os_id, os.os_desc, sub.ct
FROM (
   SELECT os_id, COUNT(*) AS ct
   FROM bi.impressions
   GROUP BY 1
   ) sub
JOIN bi.os_desc os USING (os_id);
Aggregate first, join later. More here:
- Aggregate a single column in query with many columns
- PostgreSQL - order by an array
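As a rough check of the "aggregate first, join later" idea, the sketch below runs both query shapes against a tiny in-memory SQLite database (a stand-in for PostgreSQL, with made-up rows) and compares the results. It only demonstrates that the two forms return the same rows when every impression has exactly one matching os_desc row. It says nothing about timing, which depends on your data and planner.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE os_desc (os_id INTEGER PRIMARY KEY, os_desc TEXT NOT NULL);
    CREATE TABLE impressions (
        impression_id INTEGER PRIMARY KEY,
        os_id INTEGER NOT NULL REFERENCES os_desc (os_id)
    );
    INSERT INTO os_desc VALUES (1, 'linux'), (2, 'macos'), (3, 'windows');
    INSERT INTO impressions (os_id) VALUES (1), (1), (2), (3), (3), (3);
""")

# Join first, then aggregate: every impression row passes through the join.
join_then_aggregate = conn.execute("""
    SELECT imp.os_id, os.os_desc, COUNT(*) AS ct
    FROM impressions imp
    JOIN os_desc os ON os.os_id = imp.os_id
    GROUP BY imp.os_id, os.os_desc
    ORDER BY imp.os_id
""").fetchall()

# Aggregate first, then join: only one row per os_id reaches the join.
aggregate_then_join = conn.execute("""
    SELECT os.os_id, os.os_desc, sub.ct
    FROM (
        SELECT os_id, COUNT(*) AS ct
        FROM impressions
        GROUP BY os_id
    ) sub
    JOIN os_desc os ON os.os_id = sub.os_id
    ORDER BY os.os_id
""").fetchall()

print(join_then_aggregate == aggregate_then_join)
print(aggregate_then_join)
```

The second form joins three pre-aggregated rows instead of six raw rows. On a large impressions table that gap is the source of the speedup.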
Why does LEFT JOIN increase query time so much?
The 'small' left join is actually doing a lot of extra work for you. SQL Server has to go back to TABLE_Additional for each row from your inner join between TABLE_Accounts_History and TABLE_For_Filtering. You can help SQL Server speed this up with some indexing. You could:
1) Ensure TABLE_Accounts_History has an index on the Foreign Key H.[ACCOUNTSYS]
2) If you think that TABLE_Additional will always be accessed by AccountSys, i.e. you will be requesting AccountSys in ordered groups, you could create a clustered index on TABLE_Additional.AccountSys (in other words, physically order the table on disk by AccountSys).
3) You could also ensure there is a foreign key index on TABLE_Accounts_History.
Why does SELECT * INTO x FROM a JOIN b take significantly greater time than total time of SELECT COUNT(*) FROM a JOIN b & SELECT * INTO y FROM x?
The difference is I/O. The JOIN has to process all the data rather than just the row counts. You are not taking this processing time into account.
Given the work that the JOIN has to do, an additional read/write of the data seems about right.