Aggregate Function in SQL Where-Clause

Aggregate function in SQL WHERE-Clause

You haven't mentioned the DBMS. Assuming you are using MS SQL-Server, I've found a T-SQL Error message that is self-explanatory:

"An aggregate may not appear in the
WHERE clause unless it is in a
subquery contained in a HAVING clause
or a select list, and the column being
aggregated is an outer reference"

http://www.sql-server-performance.com/

And an example that it is possible in a subquery.

Show all customers and smallest order for those who have 5 or more orders (and NULL for others):

SELECT a.lastname
     , a.firstname
     , ( SELECT MIN( o.amount )
         FROM orders o
         WHERE a.customerid = o.customerid
           AND COUNT( a.customerid ) >= 5
        )
        AS smallestOrderAmount
FROM account a
GROUP BY a.customerid
       , a.lastname
       , a.firstname ;

UPDATE.

The above runs in both SQL-Server and MySQL but it doesn't return the result I expected. The next one is more close. I guess it has to do with that the field customerid, GROUPed BY and used in the query-subquery join is in the first case PRIMARY KEY of the outer table and in the second case it's not.

Show all customer ids and number of orders for those who have 5 or more orders (and NULL for others):

SELECT o.customerid
     , ( SELECT COUNT( o.customerid )
         FROM account a
         WHERE a.customerid = o.customerid
           AND COUNT( o.customerid ) >= 5
        )
        AS cnt
FROM orders o
GROUP BY o.customerid ;

Why are aggregate functions not allowed in where clause

The reason you can't use SUM() in the WHERE clause is the order of evaluation of clauses.

FROM tells you where to read rows from. Right as rows are read from disk to memory, they are checked for the WHERE conditions. (Actually in many cases rows that fail the WHERE clause will not even be read from disk. "Conditions" are formally known as predicates and some predicates are used - by the query execution engine - to decide which rows are read from the base tables. These are called access predicates.) As you can see, the WHERE clause is applied to each row as it is presented to the engine.

On the other hand, aggregation is done only after all rows (that verify all the predicates) have been read.

Think about this: SUM() applies ONLY to the rows that satisfy the WHERE conditions. If you put SUM() in the WHERE clause, you are asking for circular logic. Does a new row pass the WHERE clause? How would I know? If it will pass, then I must include it in the SUM, but if not, it should not be included in the SUM. So how do I even evaluate the SUM condition?

SQL query aggregate function in WHERE clause

Try this

SELECT table1.col1, avg(datediff(dd, table2.date, table3.date)) as avg_date
INNER JOIN table2
INNER JOIN table3
HAVING avg_date <= (SELECT avg_date from
(SELECT table1.col1, avg(datediff(dd, table2.date, table3.date)) as avg_date
INNER JOIN table2
INNER JOIN table3
GROUP BY table1.col1)
);

Alternative:

SELECT * FROM
(
    SELECT table1.col1, avg(datediff(dd, table2.date, table3.date)) as avg_date
    INNER JOIN table2
    INNER JOIN table3
) outer_table
WHERE avg_date <= ALL(SELECT avg_date from(SELECT table1.col1,avg(datediff(dd, table2.date, table3.date)) as avg_date
INNER JOIN table2
INNER JOIN table3
GROUP BY table1.col1));

Edited for SQL Server

SELECT * FROM
(
    SELECT table1.col1, avg(datediff(dd, table2.date, table3.date)) as avg_date
    INNER JOIN table2
    INNER JOIN table3
) outer_table
WHERE avg_date <= ALL(SELECT inner_table.avg_date from
(SELECT table1.col1,avg(datediff(dd, table2.date, table3.date)) as avg_date
INNER JOIN table2
INNER JOIN table3
GROUP BY table1.col1) inner_table);

Aggregate column from CTE cannot be used in WHERE clause in the query in PostgreSQL

The t2.count is being interpreted as an aggregate COUNT() function, and your t2 table does not have a column called count.

Make sure that your table does actually have a count column, or make sure to compute it's aggregate count on another CTE before joining, and then comparing the the results. Also avoid using the alias "count", like the following:

WITH CTE AS (
    SELECT t1.x, COUNT(t1.y) AS total
    FROM table1 t1
    GROUP BY t1.x
),
CTE2 AS (
    SELECT t2.x, COUNT(t2.y) AS total
    FROM table2 t2
    GROUP BY t2.x
)
SELECT 
    CTE1.x, 
    CTE1.total AS newCount, 
    CTE2.total AS oldCount
FROM 
    CTE2 
    JOIN CTE1 ON CTE2.x = CTE1.x
WHERE 
    CTE2.total != CTE1.total;

Aggregate Function in SQL Where-Clause