How to Treat a Union Query as a Sub Query

How to use join and union for sub queries in mysql

You need to make a couple of changes in order for this query to work:

  • Wrap inner SELECT queries into another SELECT * query to make JOIN work, e.g. () SP LEFT JOIN () SP is not a valid syntax. Instead, use SELECT * FROM (..) SP JOIN (..) FP
  • Remove SP.* from outer SELECT as SP and FP are only visible to inner queries, use SELECT * instead
  • Remove WHERE SP.Service=FP.Service from outer WHERE clause as again, SP and FP won't be visible.

The below query should work:

SELECT *
FROM (
(
SELECT * FROM
(SELECT Service, SUM(Processed) as Second_Period, COUNT(Processed) as TRX_SP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-07 00:00:00' and '2017-05-14 00:00:00'
GROUP BY Service) SP
RIGHT JOIN
(SELECT Service, SUM(Processed) as First_Period, COUNT(Processed) as TRX_FP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-01 00:00:00' and '2017-05-06 00:00:00'
GROUP BY Service) FP USING (Service))
UNION ALL
(
SELECT * FROM
(SELECT Service, SUM(Processed) as Second_Period, COUNT(Processed) as TRX_SP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-07 00:00:00' and '2017-05-14 00:00:00'
GROUP BY Service) SP
LEFT OUTER JOIN
(SELECT Service, SUM(Processed) as First_Period, COUNT(Processed) as TRX_FP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-01 00:00:00' and '2017-05-06 00:00:00'
GROUP BY Service) FP USING (Service) )) as tbl2
GROUP BY Service
Order BY Service

update

You can't use function in USING clause, so you need to alias that column and use the alias in USING, e.g.:

SELECT *
FROM (
(
SELECT * FROM
(SELECT DAYNAME(Dataime) as 'day', SUM(Processed) as Second_Period, COUNT(Processed) as TRX_SP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-07 00:00:00' and '2017-05-14 00:00:00'
GROUP BY DAYNAME(Dataime)) SP
RIGHT JOIN
(SELECT DAYNAME(Dataime) as 'day', SUM(Processed) as First_Period, COUNT(Processed) as TRX_FP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-01 00:00:00' and '2017-05-06 00:00:00'
GROUP BY DAYNAME(Dataime)) FP USING(`day`))
UNION ALL
(
SELECT * FROM
(SELECT DAYNAME(Dataime) as `day`, SUM(Processed) as Second_Period, COUNT(Processed) as TRX_SP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-07 00:00:00' and '2017-05-14 00:00:00'
GROUP BY DAYNAME(Dataime)) SP
LEFT OUTER JOIN
(SELECT DAYNAME(Dataime) as `day`, SUM(Processed) as First_Period, COUNT(Processed) as TRX_FP
FROM pay
WHERE Status1='Processed' AND Dataime BETWEEN '2017-05-01 00:00:00' and '2017-05-06 00:00:00'
GROUP BY DAYNAME(Dataime)) FP USING (`day`) )) as tbl2
GROUP BY `day`
Order BY `day`

Selecting with UNION but limiting every subquery and receiving distinct values

If necessary, replace "id" with the name of your primary key:

(SELECT * FROM table1 WHERE criteria ORDER BY views DESC LIMIT 15)
UNION
(SELECT * FROM table1 WHERE criteria AND id NOT IN(SELECT id FROM table1 WHERE criteria LIMIT 15) ORDER BY date_upload DESC LIMIT 15)

This query:

- selects the top 15 records matching criteria ordered by views

- selects the top 15 matching criteria and not in the first SELECT, and orders them by date_upload

With this query you will be sure to get 30 records every time 30 distinct records are available in table1.

SQL: Optimize nested query with redundant union code?

If you don't like repeating code you could use common table expression:

WITH cte AS (
SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789'
)
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE clientLocalID IN (SELECT HierarchyItem FROM cte)
UNION
SELECT ClientMasterName, '0', clientMasterID, 2
FROM dbo.ClientMaster
WHERE clientMasterID IN (SELECT HierarchyItem FROM cte)
ORDER BY HierarchyType ASC;

Use combination of sql union sub-queries to handle combinations of AND/OR conditions for customer filtering

If you are only ever going to be querying rows from the current day, it makes sense to have an overnight job that sets up a new partition on each table for the current day (and moves the previous day's rows into the main partition for each table). That way, you should only be querying thousands, not millions, of records with each query.

If the date/time range can potentially be from any day, then it would make more sense to set up a new index on each table, either on UPDATED alone or on a combination of UPDATED and PARTY_ID. If you have the opportunity, I recommend trying your updated query against both sets of indexes, and seeing how they compare.

One approach to implementing the more complex functionality required might be by summing conditional expressions - for example, in the case where you want PARTY_ID in TABLE1 and TABLE2 and TABLE3 or TABLE4 and TABLE5:

select PARTY_ID from
(select distinct PARTY_ID, 'TABLE1' TABLENAME from TABLE1
WHERE UPDATED >= '2015-09-18 14:43:44' AND UPDATED <= '2015-09-18 15:00:00' union all
select distinct PARTY_ID, 'TABLE2' TABLENAME from TABLE2
WHERE UPDATED >= '2015-09-18 14:43:44' AND UPDATED <= '2015-09-18 15:00:00' union all
select distinct PARTY_ID, 'TABLE3' TABLENAME from TABLE3
WHERE UPDATED >= '2015-09-18 14:43:44' AND UPDATED <= '2015-09-18 15:00:00' union all
select distinct PARTY_ID, 'TABLE4' TABLENAME from TABLE4
WHERE UPDATED >= '2015-09-18 14:43:44' AND UPDATED <= '2015-09-18 15:00:00' union all
select distinct PARTY_ID, 'TABLE5' TABLENAME from TABLE5
WHERE UPDATED >= '2015-09-18 14:43:44' AND UPDATED <= '2015-09-18 15:00:00') as ilv
GROUP BY PARTY_ID
HAVING SUM(CASE WHEN TABLENAME IN ('TABLE1','TABLE2','TABLE3') THEN 1 END)=3
OR SUM(CASE WHEN TABLENAME IN ('TABLE4','TABLE5') THEN 1 END)=2;

Note that the numbers in the equality expressions (in the HAVING clause) need to match the total number of tables being checked in the CASE expression - so the query needs to check whether the SUMmed CASE expression equals 3 when checking that the PARTY_ID is in all of the first three tables, and needs to check whether the second expression equals 2 when checking the last two tables.

Using UNION sub query inside of NOT IN MySQL

I would use an exists clause, two of them actually:

SELECT s.student_id, s.student_name
FROM students s
WHERE NOT EXISTS (SELECT 1 FROM free_students fs WHERE s.student_id = fs.student_id) AND
NOT EXISTS (SELECT 1 FROM paid_students ps WHERE s.student_id = ps.student_id);

BigQuery: Subquery with UNION as ARRAY

Using CROSS JOINS,

SELECT o.*, ARRAY_AGG(i) invoices
FROM Orders o, Invoices i
WHERE o.order_id = i.order_id
OR i.linked_order_id IN (o.linked_order1, o.linked_order2)
GROUP BY 1, 2, 3;
Query results

Sample Image

[UPDATE]

Sometimes the query using OR conditions in WHERE clause might show poor perfomrance in large dataset. In that case you may try below query instead that generates same result.

SELECT o.*, ARRAY_AGG(i) invoices FROM (
SELECT o, i FROM Orders o JOIN Invoices i USING (order_id)
UNION ALL
SELECT o, i FROM Orders o JOIN Invoices i ON i.linked_order_id IN (o.linked_order1, o.linked_order2)
) GROUP BY 1, 2, 3;

Do I need a subquery when doing DISTRIBUTE BY in a UNION query?

From the Hive docs:

To apply ORDER BY, SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT to an
individual SELECT, place the clause inside the parentheses that
enclose the SELECT:

SELECT key FROM (SELECT key FROM src ORDER BY key LIMIT 10) subq1 
UNION
SELECT key FROM (SELECT key FROM src1 ORDER BY key LIMIT 10) subq2

To apply an ORDER BY, SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT
clause to the entire UNION result, place the ORDER BY, SORT BY,
CLUSTER BY, DISTRIBUTE BY or LIMIT after the last one.

The following
example uses both ORDER BY and LIMIT clauses:

SELECT key FROM src
UNION
SELECT key FROM src1
ORDER BY key LIMIT 10

So in your case the DISTRIBUTE BY clause will be applied to the whole result of UNION.



Related Topics



Leave a reply



Submit