How to Re-Use Result for Select, Where and Order by Clauses

How to re-use result for SELECT, WHERE and ORDER BY clauses?

In the GROUP BY and ORDER BY clauses you can refer to column aliases (output columns) or even ordinal numbers of SELECT list items. I quote the manual on ORDER BY:

Each expression can be the name or ordinal number of an output column
(SELECT list item), or it can be an arbitrary expression formed from
input-column values.

Bold emphasis mine.

But in the WHERE and HAVING clauses, you can only refer to columns from the base tables (input columns), so you have to spell out your function call.

SELECT *, earth_distance(ll_to_earth(62.0, 25.0), ll_to_earth(lat, lon)) AS dist
FROM   venues 
WHERE  earth_distance(ll_to_earth(62.0, 25.0), ll_to_earth(lat, lon)) <= radius 
ORDER  BY dist;

If you want to know if it's faster to pack the calculation into a CTE or subquery, just test it with EXPLAIN ANALYZE. (I doubt it.)

SELECT *
FROM  (
   SELECT *
         ,earth_distance(ll_to_earth(62.0, 25.0), ll_to_earth(lat, lon)) AS dist
   FROM   venues
   ) x
WHERE  dist <= radius
ORDER  BY dist;
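
To try the derived-table form without a PostgreSQL instance, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module. It rests on assumptions: SQLite has no earthdistance extension, so a hypothetical flat_distance function (plain planar distance on lat/lon) stands in for earth_distance(ll_to_earth(...)), and the venues rows are made up for illustration.

```python
import math
import sqlite3


def flat_distance(lat1, lon1, lat2, lon2):
    # Hypothetical stand-in for earth_distance(): planar distance in degrees.
    return math.hypot(lat1 - lat2, lon1 - lon2)


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it with 4 arguments.
conn.create_function("flat_distance", 4, flat_distance)
conn.execute("CREATE TABLE venues (name TEXT, lat REAL, lon REAL, radius REAL)")
conn.executemany(
    "INSERT INTO venues VALUES (?, ?, ?, ?)",
    [
        ("near", 62.0, 25.5, 1.0),     # inside its own radius
        ("far", 60.0, 25.0, 1.0),      # outside its own radius
        ("closest", 62.1, 25.0, 0.5),  # inside its own radius
    ],
)

# The expression is written once in the inner query; the outer query
# filters and sorts by its alias.
QUERY = """
SELECT name, dist
FROM  (
   SELECT *, flat_distance(62.0, 25.0, lat, lon) AS dist
   FROM   venues
   ) x
WHERE  dist <= radius
ORDER  BY dist
"""
rows = conn.execute(QUERY).fetchall()
names = [name for name, _ in rows]
print(names)
```

Writing the expression once this way is about readability; whether it is also faster is still something to check with EXPLAIN ANALYZE on the real database.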

Like @Mike commented, by declaring a function STABLE (or IMMUTABLE) you inform the query planner that results from a function call can be reused multiple times for identical calls within a single statement. I quote the manual here:

A STABLE function cannot modify the database and is guaranteed to
return the same results given the same arguments for all rows within a
single statement. This category allows the optimizer to optimize
multiple calls of the function to a single call.

Bold emphasis mine.

How can I reuse a calculated CASE column in the WHERE clause?

In SQL Server you cannot refer to the alias of a calculated column in the WHERE clause, only in the ORDER BY clause.

So you either need a sub-query of some form or you have to repeat the calculation. CROSS APPLY is a neat way to accomplish this.

SELECT T1.id, X.AmberRed
FROM db.tbl1 T1
CROSS APPLY (VALUES (
    CASE WHEN T1.id LIKE 'A_SEQ%' THEN 'amber'
    WHEN TestReason = 'itf' THEN 'red'
    ELSE NULL END
)) AS X (AmberRed)
WHERE X.AmberRed IS NOT NULL;

But a simple sub-query will also do the job

SELECT X.id, X.AmberRed
FROM (
    SELECT T1.id
        , CASE WHEN T1.id LIKE 'A_SEQ%' THEN 'amber'
        WHEN TestReason = 'itf' THEN 'red'
        ELSE NULL END AS AmberRed
    FROM db.tbl1 T1
) X
WHERE X.AmberRed IS NOT NULL;

Or you can even just repeat the expression if it's simple:

SELECT T1.id
    , CASE WHEN T1.id LIKE 'A_SEQ%' THEN 'amber'
    WHEN TestReason = 'itf' THEN 'red'
    ELSE NULL END
FROM db.tbl1 T1
WHERE CASE WHEN T1.id LIKE 'A_SEQ%' THEN 'amber'
    WHEN TestReason = 'itf' THEN 'red'
    ELSE NULL END IS NOT NULL;

Save results of a select clause in a query to reuse in the same query?

Just add an extra query level so that the outer query can use the sub-query's result, as in the query below.

I have also corrected a syntax error at as v(col).

select id, NumBananas
from (
  select Menu.id, (
      select count(*)
      from (values (Menu.item1), (Menu.item2), (Menu.item3)) as v(col)
      where v.col = 'banana'
    ) NumBananas
  from Menu
) X
where X.NumBananas = 2;

Reuse the result of a select expression in the GROUP BY clause?

Is this construct possible in Spark SQL?

Yes, it is. There are two ways to use the new column in the GROUP BY and ORDER BY clauses in Spark SQL.

Approach 1, using a subquery:

-- The aggregate moves to the outer query, where timeHour can be grouped on
SELECT timeHour, sum(...) AS someThing
FROM (
    SELECT *
        , from_unixtime(starttime/1000) AS timeHour
    FROM
        some_table
) t
WHERE
    starttime >= 1000*unix_timestamp('2017-09-16 00:00:00')
      AND starttime <= 1000*unix_timestamp('2017-09-16 04:00:00')
GROUP BY
    timeHour
ORDER BY
    timeHour
LIMIT 10;

Approach 2, using WITH (the more elegant way):

-- create the alias
WITH table_aliase AS (
    SELECT *
        , from_unixtime(starttime/1000) AS timeHour
    FROM
        some_table
)

-- use the alias like a table; the aggregate goes here, next to GROUP BY
SELECT timeHour, sum(...) AS someThing
FROM table_aliase
WHERE
    starttime >= 1000*unix_timestamp('2017-09-16 00:00:00')
      AND starttime <= 1000*unix_timestamp('2017-09-16 04:00:00')
GROUP BY
    timeHour
ORDER BY
    timeHour
LIMIT 10;
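
The WITH pattern can be tried outside Spark too. The following is a rough sketch using Python's built-in sqlite3 module, not Spark SQL: SQLite's strftime(..., 'unixepoch') stands in for from_unixtime and here also truncates to the hour, the amount column is a hypothetical substitute for whatever is being summed, and the rows are invented.

```python
import sqlite3
from datetime import datetime, timezone

# Midnight UTC on 2017-09-16, in milliseconds like the starttime column.
base_ms = int(datetime(2017, 9, 16, tzinfo=timezone.utc).timestamp()) * 1000
MINUTE_MS = 60 * 1000
HOUR_MS = 60 * MINUTE_MS

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (starttime INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [
        (base_ms + 10 * MINUTE_MS, 1),  # hour 00
        (base_ms + 50 * MINUTE_MS, 2),  # hour 00
        (base_ms + 70 * MINUTE_MS, 3),  # hour 01
        (base_ms + 5 * HOUR_MS, 4),     # outside the 4-hour window
    ],
)

# The CTE names the computed column once; the outer query filters,
# groups and orders by that name.
QUERY = """
WITH hourly AS (
    SELECT strftime('%Y-%m-%d %H:00:00', starttime / 1000, 'unixepoch') AS timeHour
         , amount
         , starttime
    FROM some_table
)
SELECT timeHour, sum(amount) AS someThing
FROM hourly
WHERE starttime >= :lo AND starttime <= :hi
GROUP BY timeHour
ORDER BY timeHour
LIMIT 10
"""
result = conn.execute(QUERY, {"lo": base_ms, "hi": base_ms + 4 * HOUR_MS}).fetchall()
print(result)
```

The aggregate sits in the outer SELECT, next to the GROUP BY, while the CTE only derives timeHour.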

Alternative using the Spark DataFrame API (without SQL) in Scala:

// The $"col" syntax also needs: import spark.implicits._ (spark being your SparkSession)

val df = .... //load the actual table as df

import org.apache.spark.sql.functions._

df.withColumn("timeHour", from_unixtime($"starttime"/1000))
  .groupBy($"timeHour")
  .agg(sum("...").as("someThing"))
  .orderBy($"timeHour")
  .show()

//another way - as per eliasah comment
df.groupBy(from_unixtime($"starttime"/1000).as("timeHour"))
  .agg(sum("...").as("someThing"))
  .orderBy($"timeHour")
  .show()

Reuse Where Condition throughout Stored Procedure?

If you have individual select statements with no relation between them, then this might help you. Prepare your SELECT and WHERE fragments in separate variables and concatenate them before you execute the query.

DECLARE @UserId UNIQUEIDENTIFIER = '96105876-AB55-4D28-A8DD-6BFEC9D38EF8'

DECLARE @Query NVARCHAR(MAX) , @SelectQuery NVARCHAR(MAX), @WhereQuery NVARCHAR(MAX)

-- The leading space keeps the clause from fusing with the end of the SELECT text
SET @WhereQuery = ' WHERE U.Id = @UserId AND U.Active = 1 AND U.Deleted = 0'

 -- Query 1 

SET @SelectQuery =
'SELECT 
    CD.[Name],
    CD.[ASIN],
    CD.[Category],
    CD.[Quantity],
    CD.[Total],
    CD.[Weight]
FROM CartDetails CD
INNER JOIN Cart C ON C.Id = CD.CartId 
INNER JOIN Users U ON U.Id = C.UserId'

SET @Query = @SelectQuery + @WhereQuery

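-- Dynamic SQL cannot see the outer @UserId variable, so pass it in as a parameter
EXEC sp_executesql @Query, N'@UserId UNIQUEIDENTIFIER', @UserId = @UserId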

--- Query 2

SET @Query = NULL 

SET @SelectQuery = NULL

SET @SelectQuery ='
SELECT  
    DCC.DiscountAmount, 
    DCC.DiscountCouponId
FROM DiscountCouponConsumed DCC
INNER JOIN Users U ON U.Id = DCC.UserId '

SET @Query = @SelectQuery + @WhereQuery

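-- Dynamic SQL cannot see the outer @UserId variable, so pass it in as a parameter
EXEC sp_executesql @Query, N'@UserId UNIQUEIDENTIFIER', @UserId = @UserId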
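
The same idea, one WHERE fragment shared by several statements with the user id bound as a parameter, can be sketched from application code. This is only an illustration using Python's built-in sqlite3 module rather than T-SQL; the trimmed-down tables, sample rows and the 'u1'/'u2' ids are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (Id TEXT PRIMARY KEY, Active INTEGER, Deleted INTEGER);
CREATE TABLE Cart (Id INTEGER PRIMARY KEY, UserId TEXT);
CREATE TABLE CartDetails (CartId INTEGER, Name TEXT, Quantity INTEGER);
CREATE TABLE DiscountCouponConsumed (UserId TEXT, DiscountAmount REAL, DiscountCouponId INTEGER);
INSERT INTO Users VALUES ('u1', 1, 0), ('u2', 1, 1);
INSERT INTO Cart VALUES (1, 'u1'), (2, 'u2');
INSERT INTO CartDetails VALUES (1, 'book', 2), (2, 'lamp', 1);
INSERT INTO DiscountCouponConsumed VALUES ('u1', 5.0, 10), ('u2', 7.5, 11);
""")

# Shared filter, written once. The leading space keeps it from fusing with
# the SELECT text, and :user_id is bound at execution time, not pasted in.
WHERE_CLAUSE = " WHERE U.Id = :user_id AND U.Active = 1 AND U.Deleted = 0"

CART_SELECT = (
    "SELECT CD.Name, CD.Quantity "
    "FROM CartDetails CD "
    "INNER JOIN Cart C ON C.Id = CD.CartId "
    "INNER JOIN Users U ON U.Id = C.UserId"
)
COUPON_SELECT = (
    "SELECT DCC.DiscountAmount, DCC.DiscountCouponId "
    "FROM DiscountCouponConsumed DCC "
    "INNER JOIN Users U ON U.Id = DCC.UserId"
)

params = {"user_id": "u1"}
cart_rows = conn.execute(CART_SELECT + WHERE_CLAUSE, params).fetchall()
coupon_rows = conn.execute(COUPON_SELECT + WHERE_CLAUSE, params).fetchall()
# 'u2' is flagged Deleted, so the shared filter removes its rows.
deleted_rows = conn.execute(CART_SELECT + WHERE_CLAUSE, {"user_id": "u2"}).fetchall()
print(cart_rows, coupon_rows, deleted_rows)
```

Only the fixed fragments are concatenated; the user id always travels as a bound parameter, which avoids building SQL from user-supplied values.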

Reuse calculated column in WHERE clause

There is no way to reuse the calculated field on the same level SELECT. You will need to nest it in order to use the alias.

SELECT field1
    , calc_field
FROM (
    SELECT field1
        , CONCAT (field2, field3) AS calc_field
    FROM MyTable
    ) tbl
WHERE calc_field LIKE 'A%'

This is because of the logical order in which the clauses of a SQL query are evaluated: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY. The SELECT clause, where the alias is generated, is evaluated after the WHERE clause.

Thus, the alias is not "visible" in the WHERE clause, because the alias is generated after the WHERE is applied.

How to reuse variable from SQL SELECT statement in WHERE clause

Assuming you do not need @DISTANCE outside the select, here's an approach to try:

    SELECT *,
            6371 * 2 * asin(sqrt(POW(sin(({lat} - radians(address.latitude)) / 2), 2) + cos({lat}) * cos(radians(address.latitude)) * POW(sin(({lon} - radians(address.longitude)) / 2), 2))) AS DISTANCE
    FROM    service,
            provider,
            address 
    WHERE
        service.provider_id = provider.id 
        AND provider.address_id = address.id 
        AND provider.status = True 
    HAVING 
        DISTANCE < 10 
    ORDER BY
        DISTANCE

Note that I have given the calculation the alias DISTANCE (see the far right of the SELECT list) and used HAVING rather than WHERE to reference the alias.
