Nesting Queries in SQL

Nested select statement in SQL Server

You need to alias the subquery.

SELECT name FROM (SELECT name FROM agentinformation) a  

or to be more explicit

SELECT a.name FROM (SELECT name FROM agentinformation) a  

Avoiding Nested Queries

It really depends. I have had situations where I improved some queries by using subqueries.

The factors that I am aware of are:

  • if the subquery uses fields from outer query for comparison or not (correlated or not)
  • if the relation between the outer query and sub query is covered by indexes
  • if there are no usable indexes on the joins and the subquery is not correlated and returns a small result it might be faster to use it
  • I have also run into situations in MySQL where removing the ORDER BY from a query, wrapping the result in a simple subquery, and sorting that improved performance

Anyway, it is always good to test different variants (with SQL_NO_CACHE please), and turning correlated queries into joins is a good practice.

I would even go so far as to call it a very useful practice.

If correlated queries are the first thing that comes to your mind, you may be thinking primarily in terms of procedural operations rather than set operations. When dealing with relational databases, it is very useful to fully adopt the set perspective on the data model and the transformations on it.

EDIT:
Procedural vs Relational
Thinking in terms of set operations vs procedural ones boils down to equivalences between set algebra expressions. For example, selection on a union is equivalent to the union of selections. As expressions, there is no difference between the two.

But when you compare them as procedures ("apply the selection criteria to every element of a union" versus "make a union and then apply the selection"), the two are distinctly different, and they might have very different properties (for example utilization of CPU, I/O, memory).
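The equivalence can be checked on a toy example. This is only a sketch using Python's built-in sqlite3 module; the tables r and s, the column x, and the threshold are made up for illustration and are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables r and s, each with a single integer column x.
conn.executescript("""
    CREATE TABLE r (x INTEGER);
    CREATE TABLE s (x INTEGER);
    INSERT INTO r VALUES (1), (5), (9);
    INSERT INTO s VALUES (5), (7), (12);
""")

# Selection applied to the union.
select_after_union = conn.execute("""
    SELECT x FROM (SELECT x FROM r UNION SELECT x FROM s) u
    WHERE x > 4
    ORDER BY x
""").fetchall()

# Union of the two selections.
union_of_selects = conn.execute("""
    SELECT x FROM r WHERE x > 4
    UNION
    SELECT x FROM s WHERE x > 4
    ORDER BY x
""").fetchall()

print(select_after_union == union_of_selects)
```

Both forms return the same rows; which one runs faster is up to the DBMS and the data.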

The idea behind relational databases is that you do not try to describe how to get the result (procedure), but only what you want, and that the database management system will decide on the best path (procedure) to fulfil your request. This is why SQL is called a fourth-generation language (4GL).

One of the tricks that help you do that is to remind yourself that tuples have no inherent order (set elements are unordered).
Another is realizing that relational algebra is quite comprehensive and allows translation of requests (requirements) directly to SQL, provided the semantics of your model represent the problem space well; in other words, provided the meaning attached to the names of your tables and relationships is right and your database is designed well.

Therefore, you do not have to think how, only what.

In your case, it was just preference over correlated queries, so it might be that I am not telling you anything new, but you emphasized that point, hence the comment.

I think that if you were completely comfortable with all the rules that transform queries from one form into another (rules such as distributivity), you would not prefer correlated subqueries; you would see all forms as equal.

(Note: the above discusses the theoretical background, which is important for database design. In practice things deviate from these concepts: not all equivalent rewrites of a query execute equally fast, clustered primary keys do give tables an inherent order on disk, and so on. But these are only deviations; the fact that not all equivalent queries execute equally fast is an imperfection of the actual DBMS and not of the concepts behind it.)

Nesting queries in SQL

If it has to be "nested", this would be one way to get your job done:

SELECT o.name AS country, o.headofstate 
FROM country o
WHERE o.headofstate like 'A%'
AND (
SELECT i.population
FROM city i
WHERE i.id = o.capital
) > 100000

A JOIN would usually be more efficient than a correlated subquery, though. Could it be that whoever gave you that task is not up to speed themselves?
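For comparison, here is a sketch of the JOIN form next to the correlated form, run through Python's built-in sqlite3 module. The table and column names come from the query above; the column types and the sample rows are invented, and the rewrite assumes city.id is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema guessed from the query above; the rows are made-up sample data.
conn.executescript("""
    CREATE TABLE city (id INTEGER PRIMARY KEY, population INTEGER);
    CREATE TABLE country (name TEXT, headofstate TEXT, capital INTEGER);
    INSERT INTO city VALUES (1, 250000), (2, 40000), (3, 900000);
    INSERT INTO country VALUES
        ('Alphaland', 'Anna', 1),
        ('Betaland', 'Arno', 2),
        ('Gammaland', 'Bert', 3);
""")

# Correlated subquery, as in the answer above.
correlated = conn.execute("""
    SELECT o.name AS country, o.headofstate
    FROM country o
    WHERE o.headofstate LIKE 'A%'
      AND (SELECT i.population FROM city i WHERE i.id = o.capital) > 100000
    ORDER BY o.name
""").fetchall()

# Equivalent JOIN (equivalent because each capital matches at most one city).
joined = conn.execute("""
    SELECT o.name AS country, o.headofstate
    FROM country o
    JOIN city i ON i.id = o.capital
    WHERE o.headofstate LIKE 'A%'
      AND i.population > 100000
    ORDER BY o.name
""").fetchall()

print(correlated == joined)
```

Whether the JOIN is actually faster depends on the DBMS and the indexes, so it is worth measuring both.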

Nesting queries: best practices

Generally speaking, a subquery may get executed for each row in the parent query. For example:

Select * from employees where name IN (select Manager_name from Team_project where project_id=1)

Executed naively, this would run select Manager_name for every row in the employees table to compare the names. Granted, the subquery result may be cached, which would make it faster, but it is still more work. Because this subquery does not reference the outer query, many optimizers will evaluate it only once.

However, it all depends. Have a look at this discussion for more detail:
Subquery v/s inner join in sql server

SQL: Using the AND statement with nested queries

Just remove the second "Where" and your query is good to execute. You only need one where clause, within which you can combine all your conditions with and, or, etc.

SELECT EmployeeID, FirstName, LastName
FROM SQLTutorial.dbo.EmployeeDemographics

WHERE EmployeeID
IN (SELECT EmployeeID
FROM SQLTutorial.dbo.EmployeeSalary
WHERE JobTitle = 'DBA')

AND EmployeeID
IN (SELECT EmployeeID
FROM SQLTutorial.dbo.WareHouseEmployeeDemographics
WHERE Age = 29)

If instead you want all the employees whose ID is in both EmployeeDemographics and EmployeeSalary with the job title DBA, or who work in the warehouse and are 29, use or:

SELECT EmployeeID, FirstName, LastName
FROM SQLTutorial.dbo.EmployeeDemographics

WHERE EmployeeID
IN (SELECT EmployeeID
FROM SQLTutorial.dbo.EmployeeSalary
WHERE JobTitle = 'DBA')

or EmployeeID
IN (SELECT EmployeeID
FROM SQLTutorial.dbo.WareHouseEmployeeDemographics
WHERE Age = 29)
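The difference between the two variants is easy to see on a small data set. The following is a sketch with Python's built-in sqlite3 module; the SQLTutorial.dbo prefix is dropped because SQLite has no such schema, and the column types and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table names follow the answer above; the rows are hypothetical.
conn.executescript("""
    CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
    CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT);
    CREATE TABLE WareHouseEmployeeDemographics (EmployeeID INTEGER, Age INTEGER);
    INSERT INTO EmployeeDemographics VALUES (1, 'Ann', 'Lee'), (2, 'Bo', 'Kim'), (3, 'Cy', 'Ray');
    INSERT INTO EmployeeSalary VALUES (1, 'DBA'), (2, 'DBA'), (3, 'Clerk');
    INSERT INTO WareHouseEmployeeDemographics VALUES (2, 29), (3, 29);
""")

# One WHERE clause; {op} is replaced by AND or OR.
template = """
    SELECT EmployeeID
    FROM EmployeeDemographics
    WHERE EmployeeID IN (SELECT EmployeeID FROM EmployeeSalary
                         WHERE JobTitle = 'DBA')
      {op} EmployeeID IN (SELECT EmployeeID FROM WareHouseEmployeeDemographics
                          WHERE Age = 29)
    ORDER BY EmployeeID
"""

ids_and = [row[0] for row in conn.execute(template.format(op="AND"))]
ids_or = [row[0] for row in conn.execute(template.format(op="OR"))]

print(ids_and)  # employees matching both conditions
print(ids_or)   # employees matching at least one condition
```

With these rows, only employee 2 satisfies both conditions, while all three satisfy at least one.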

SQL Nested Query with Store Function

Does this do what you want?

SELECT guestId 
FROM guest
WHERE user_id IN (
SELECT user_id
FROM users
WHERE type = 'Guest'
AND getDistance(lati, longi, ?, ?) <= ?)

SQL JOIN WITH NESTED QUERY

You can do this in a single query using conditional aggregation:

select Rmd.Issue_Id,
       max(case when As_Of_Date = '2021-08-08' then ish.Member_Impact end) as prev_member_impact,
       max(case when As_Of_Date = '2021-08-15' then ish.Member_Impact end) as member_impact
from Lod.Ism_Issue_Summary_Hist_Wky ish Inner Join
     Lod.Rmd_Iss_Remed_Summary Rmd
     On ish.Issue_Id = Rmd.Issue_Id
where As_Of_Date in ('2021-08-08', '2021-08-15')
group By Rmd.Issue_Id
having max(case when As_Of_Date = '2021-08-08' then ish.Member_Impact end) <> max(case when As_Of_Date = '2021-08-15' then ish.Member_Impact end);
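To illustrate the conditional aggregation pattern on its own, here is a reduced sketch using Python's built-in sqlite3 module. It uses a single hypothetical table issue_hist instead of the two joined tables, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table stand-in for the weekly history table.
conn.executescript("""
    CREATE TABLE issue_hist (issue_id INTEGER, as_of_date TEXT, member_impact INTEGER);
    INSERT INTO issue_hist VALUES
        (1, '2021-08-08', 10), (1, '2021-08-15', 12),
        (2, '2021-08-08', 5),  (2, '2021-08-15', 5),
        (3, '2021-08-15', 7);
""")

# Pivot the two dates into columns, keeping only issues whose impact changed.
changed = conn.execute("""
    SELECT issue_id,
           MAX(CASE WHEN as_of_date = '2021-08-08' THEN member_impact END) AS prev_member_impact,
           MAX(CASE WHEN as_of_date = '2021-08-15' THEN member_impact END) AS member_impact
    FROM issue_hist
    WHERE as_of_date IN ('2021-08-08', '2021-08-15')
    GROUP BY issue_id
    HAVING MAX(CASE WHEN as_of_date = '2021-08-08' THEN member_impact END)
        <> MAX(CASE WHEN as_of_date = '2021-08-15' THEN member_impact END)
    ORDER BY issue_id
""").fetchall()

print(changed)
```

Note that an issue with a row for only one of the two dates (issue 3 here) is dropped, because comparing a value with NULL is never true.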

SQL Nested query on same table

First derive the average space across all records...

(Select avg(area) mavg from space)

This returns a single value that we can cross join to all space records and then just check for area > average.

Select * 
from Space
cross join (Select avg(area) mavg from space) Co
where area > Co.mavg

Since we know the result from the derived table/inline view will only be 1 value every time, the cross join doesn't increase the number of rows being evaluated, and the RDBMS only has to evaluate the average once.

However, this assumes you want the average across all records and not just within a company. If it is by company, then something like:

Select S.* 
from Space S
LEFT join (Select address, avg(area) mavg from space Group by address) Co
on S.address= Co.address
where S.area > Co.mavg

This determines the average by company, joins back to space on that company, and then compares each space record for the company to the company's average.

Since we don't know how you define company in terms of data, I just assumed an "address" field.

A different approach:

Select S.* 
from space S
where S.area > (Select avg(area) from space)

However, this assumes the average across all companies.

or

Select S.*
from space S
where S.area > (Select avg(area) from space S2 where S.Company = S2.company)

If this doesn't do it, I would need to see the DDL of the table space (the structure: columns, data types, PK, FKs, etc.) and some sample data.

Unless address is the same for every company area, there must be some other criterion to relate a company to all the other company records (a name perhaps, or a consistent key?).

Personally, I find the use of a correlated subquery in this case slow, as it has to calculate the average for every record in Space. The cross join, IMO, would be more efficient.
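The variants above can be compared side by side. This is a sketch with Python's built-in sqlite3 module; the company column and the sample rows are assumptions, since the real structure of the space table is not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed structure: one row per space, grouped by a company column.
conn.executescript("""
    CREATE TABLE space (company TEXT, area INTEGER);
    INSERT INTO space VALUES ('A', 10), ('A', 30), ('B', 20), ('B', 20), ('C', 50);
""")

# Overall average: cross join against a one-row derived table.
overall_cross = conn.execute("""
    SELECT s.company, s.area
    FROM space s
    CROSS JOIN (SELECT avg(area) AS mavg FROM space) co
    WHERE s.area > co.mavg
    ORDER BY s.company, s.area
""").fetchall()

# Overall average: scalar subquery.
overall_scalar = conn.execute("""
    SELECT s.company, s.area
    FROM space s
    WHERE s.area > (SELECT avg(area) FROM space)
    ORDER BY s.company, s.area
""").fetchall()

# Per-company average: join to a grouped derived table.
per_company_join = conn.execute("""
    SELECT s.company, s.area
    FROM space s
    JOIN (SELECT company, avg(area) AS mavg FROM space GROUP BY company) co
      ON s.company = co.company
    WHERE s.area > co.mavg
    ORDER BY s.company, s.area
""").fetchall()

# Per-company average: correlated subquery.
per_company_correlated = conn.execute("""
    SELECT s.company, s.area
    FROM space s
    WHERE s.area > (SELECT avg(area) FROM space s2 WHERE s2.company = s.company)
    ORDER BY s.company, s.area
""").fetchall()

print(overall_cross, per_company_join)
```

Each pair returns the same rows; which form is faster should be checked against the actual DBMS and data.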
