Why Is There No "Product()" Aggregate Function in Sql

Why is there no PRODUCT aggregate function in SQL?

There is no PRODUCT set function in the SQL Standard. It would appear to be a worthy candidate, though (unlike, say, a CONCATENATE set function: it's not a good fit for SQL e.g. the resulting data type would involve multivalues and pose a problem as regards first normal form).

The SQL Standards aim to consolidate functionality across SQL products circa 1990 and to provide 'thought leadership' on future development. In short, they document what SQL does and what SQL should do. The absence of PRODUCT set function suggests that in 1990 no vendor though it worthy of inclusion and there has been no academic interest in introducing it into the Standard.

Of course, vendors always have sought to add their own functionality, these days usually as extentions to Standards rather than tangentally. I don't recall seeing a PRODUCT set function (or even demand for one) in any of the SQL products I've used.

In any case, the work around is fairly simple using log and exp scalar functions (and logic to handle negatives) with the SUM set function; see @gbn's answer for some sample code. I've never needed to do this in a business application, though.

In conclusion, my best guess is that there is no demand from SQL end users for a PRODUCT set function; further, that anyone with an academic interest would probably find the workaround acceptable (i.e. would not value the syntactic sugar a PRODUCT set function would provide).

Out of interest, there is indeed demand in SQL Server Land for new set functions but for those of the window function variety (and Standard SQL, too). For more details, including how to get involved in further driving demand, see Itzik Ben-Gan's blog.

Why are aggregate functions not allowed in where clause

The reason you can't use SUM() in the WHERE clause is the order of evaluation of clauses.

FROM tells you where to read rows from. Right as rows are read from disk to memory, they are checked for the WHERE conditions. (Actually in many cases rows that fail the WHERE clause will not even be read from disk. "Conditions" are formally known as predicates and some predicates are used - by the query execution engine - to decide which rows are read from the base tables. These are called access predicates.) As you can see, the WHERE clause is applied to each row as it is presented to the engine.

On the other hand, aggregation is done only after all rows (that verify all the predicates) have been read.

Think about this: SUM() applies ONLY to the rows that satisfy the WHERE conditions. If you put SUM() in the WHERE clause, you are asking for circular logic. Does a new row pass the WHERE clause? How would I know? If it will pass, then I must include it in the SUM, but if not, it should not be included in the SUM. So how do I even evaluate the SUM condition?

Product() aggregate function

Oracle has exposed these ODCI (Oracle Data Cartridge Interface) methods to do the Aggregate! Reference:

  • ODCIAggregateDelete() - Removes an input value from the current group.
  • ODCIAggregateInitialize() - Initializes the aggregation context and instance of the implementation object type, and returns it as an OUT parameter.
  • ODCIAggregateIterate() - Iterates through input rows by processing the input values, updating and then returning the aggregation context.
  • ODCIAggregateMerge() - Merges two aggregation contexts into a single object instance during either serial or parallel evaluation of the user-defined aggregate.
  • ODCIAggregateTerminate() - Calculates the result of the aggregate computation and performs all necessary cleanup, such as freeing memory.
  • ODCIAggregateWrapContext() Integrates all external pieces of the current aggregation context to make the context self-contained.

Code For PRODUCT() Aggregate function :

CREATE OR REPLACE type PRODUCT_IMPL
AS
object
(
result NUMBER,
static FUNCTION ODCIAggregateInitialize(sctx IN OUT PRODUCT_IMPL)
RETURN NUMBER,
member FUNCTION ODCIAggregateIterate(self IN OUT PRODUCT_IMPL,
value IN NUMBER)
RETURN NUMBER,
member FUNCTION ODCIAggregateTerminate( self IN PRODUCT_IMPL,
returnValue OUT NUMBER,
flags IN NUMBER)
RETURN NUMBER,
member FUNCTION ODCIAggregateMerge(self IN OUT PRODUCT_IMPL,
ctx2 IN PRODUCT_IMPL )
RETURN NUMBER );
/
/* 1.Initializes the computation by initializing the aggregation context—the rows over which aggregation is performed: */
CREATE OR REPLACE type body PRODUCT_IMPL
IS
static FUNCTION ODCIAggregateInitialize(sctx IN OUT PRODUCT_IMPL)
RETURN NUMBER
IS
BEGIN
sctx := PRODUCT_IMPL(1);
RETURN ODCIConst.Success;
END;
/* 2.Iteratively processes each successive input value and updates the context: */
member FUNCTION ODCIAggregateIterate(self IN OUT PRODUCT_IMPL,
value IN NUMBER)
RETURN NUMBER
IS
BEGIN
self.result := value * self.result;
RETURN ODCIConst.Success;
END;
member FUNCTION ODCIAggregateTerminate(
self IN PRODUCT_IMPL,
returnValue OUT NUMBER,
flags IN NUMBER)
RETURN NUMBER
IS
BEGIN
returnValue := self.result;
RETURN ODCIConst.Success;
END;
member FUNCTION ODCIAggregateMerge(self IN OUT PRODUCT_IMPL,
ctx2 IN PRODUCT_IMPL)
RETURN NUMBER
IS
BEGIN
self.result := self.result;
RETURN ODCIConst.Success;
END;
END;
/

/* Create A function using the PRODUCT_IMPL implementation we did above */
CREATE OR REPLACE FUNCTION product(input NUMBER)
RETURN NUMBER
PARALLEL_ENABLE AGGREGATE USING PRODUCT_IMPL;
/

Results:

SELECT group_name,product(num) FROM product_test GROUP BY group_name;

Mahesh -60000
Mahesh_1 9

Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

Suppose I have the following table T:

a   b
--------
1 abc
1 def
1 ghi
2 jkl
2 mno
2 pqr

And I do the following query:

SELECT a, b
FROM T
GROUP BY a

The output should have two rows, one row where a=1 and a second row where a=2.

But what should the value of b show on each of these two rows? There are three possibilities in each case, and nothing in the query makes it clear which value to choose for b in each group. It's ambiguous.

This demonstrates the single-value rule, which prohibits the undefined results you get when you run a GROUP BY query, and you include any columns in the select-list that are neither part of the grouping criteria, nor appear in aggregate functions (SUM, MIN, MAX, etc.).

Fixing it might look like this:

SELECT a, MAX(b) AS x
FROM T
GROUP BY a

Now it's clear that you want the following result:

a   x
--------
1 ghi
2 pqr

You tried to execute a query that does not include the specified aggregate function

The error is because fName is included in the SELECT list, but is not included in a GROUP BY clause and is not part of an aggregate function (Count(), Min(), Max(), Sum(), etc.)

You can fix that problem by including fName in a GROUP BY. But then you will face the same issue with surname. So put both in the GROUP BY:

SELECT
fName,
surname,
Count(*) AS num_rows
FROM
author
INNER JOIN book
ON author.aID = book.authorID;
GROUP BY
fName,
surname

Note I used Count(*) where you wanted SUM(orders.quantity). However, orders isn't included in the FROM section of your query, so you must include it before you can Sum() one of its fields.

If you have Access available, build the query in the query designer. It can help you understand what features are possible and apply the correct Access SQL syntax.

must appear in the GROUP BY clause or be used in an aggregate function

Yes, this is a common aggregation problem. Before SQL3 (1999), the selected fields must appear in the GROUP BY clause[*].

To workaround this issue, you must calculate the aggregate in a sub-query and then join it with itself to get the additional columns you'd need to show:

SELECT m.cname, m.wmname, t.mx
FROM (
SELECT cname, MAX(avg) AS mx
FROM makerar
GROUP BY cname
) t JOIN makerar m ON m.cname = t.cname AND t.mx = m.avg
;

cname | wmname | mx
--------+--------+------------------------
canada | zoro | 2.0000000000000000
spain | usopp | 5.0000000000000000

But you may also use window functions, which looks simpler:

SELECT cname, wmname, MAX(avg) OVER (PARTITION BY cname) AS mx
FROM makerar
;

The only thing with this method is that it will show all records (window functions do not group). But it will show the correct (i.e. maxed at cname level) MAX for the country in each row, so it's up to you:

 cname  | wmname |          mx           
--------+--------+------------------------
canada | zoro | 2.0000000000000000
spain | luffy | 5.0000000000000000
spain | usopp | 5.0000000000000000

The solution, arguably less elegant, to show the only (cname, wmname) tuples matching the max value, is:

SELECT DISTINCT /* distinct here matters, because maybe there are various tuples for the same max value */
m.cname, m.wmname, t.avg AS mx
FROM (
SELECT cname, wmname, avg, ROW_NUMBER() OVER (PARTITION BY avg DESC) AS rn
FROM makerar
) t JOIN makerar m ON m.cname = t.cname AND m.wmname = t.wmname AND t.rn = 1
;

cname | wmname | mx
--------+--------+------------------------
canada | zoro | 2.0000000000000000
spain | usopp | 5.0000000000000000

[*]: Interestingly enough, even though the spec sort of allows to select non-grouped fields, major engines seem to not really like it. Oracle and SQLServer just don't allow this at all. Mysql used to allow it by default, but now since 5.7 the administrator needs to enable this option (ONLY_FULL_GROUP_BY) manually in the server configuration for this feature to be supported...

How to resolve nested aggregate function error?

looking to have each bucket shown as a % of the total

Combining COUNT(...) with windowed SUM() OVER() to get the total across all groups:

SELECT Case_Age_Category, 
DIV0(COUNT(Case_Age_Category), SUM(COUNT(Case_Age_Category)) OVER()) as Volume
FROM
(
-- ...
) sub
WHERE CLOSED_AT >= '2021-01-01 00:00:00.000'
GROUP BY Case_Age_Category
ORDER BY Case_Age_Category ASC

db<>fiddle demo

How to Calculate Aggregated Product Function in SQL Server

Oh, this is a pain. Most databases do not support a product aggregation function. You can emulate it with logs and powers. So, something like this might work:

select t.*,
(select exp(sum(log(serial)))
from table t2
where t2.no <= t.no
) as cumeProduct
from table t;

Note that log() might be called ln() in some databases. Also, this works for positive numbers. There are variations to handle negative numbers and zeroes, but this complicates the answer (and the sample data is all positive).

Misuse of aggregate function AVG() in SQL

I need to write the SQL query to return the number of employees for each department.

I assume you're looking for something like this:

SELECT department_id
,COUNT(employee_id) AS TotalEmployees
FROM Department
LEFT JOIN Employees
ON Employees.department_id = Department.department_id
GROUP BY department_id

Also, I need to return the employees salary that is higher than the average salary

The simplest way to return the salaries that are higher than average as a beginner sql programmer is probably something like this:

SELECT employee_salary
FROM Employees
WHERE employee_salary > (SELECT AVG(employee_salary)
FROM Employees)

As the others said, the other questions just require a bit of research. There are tonnes of resources out there to learn, but it takes time...



Related Topics



Leave a reply



Submit