How to find maximum avg
Columns resulting from aggregate functions (e.g. avg) usually get arbitrary names. Just use an alias for it, and select on that:
select max(avg_salary)
from (select worker_id, avg(salary) AS avg_salary
from workers
group by worker_id) As maxSalary;
How to select MAX from AVG?
In SQL Server, you can order the records and use TOP 1
to keep only the record that has the highest amount:
SELECT TOP 1 Customer_id, AVG(Amount) AS [Average amount paid]
FROM Payment
GROUP BY customer_id
ORDER BY [Average amount paid] DESC;
Note: for this query to make sense, you need a GROUP BY
clause. Without it, it would just return one record, with the average of payments within the whole table.
How do I get the Average of maximum value?
most simple think you can do is (if in the employee table, you have a department_id) :
SELECT AVG(p.maximum)
FROM (SELECT department_id, MAX(salary) AS maximum
FROM employees
GROUP BY department_id) p
Find maximum average in sql
SELECT AVG(salary)
FROM employee
GROUP BY name
ORDER BY AVG(salary) DESC
LIMIT 1
How to quickly find the maximum-average interval?
There is an O(N*logC)
algorithm, where C
is proportional to the maximum element value of the array. Comparing with some more complicated algorithms in recent papers, this algorithm is easier to understand, and can be implemented in a short time, and still fast enough in practical.
For simplicity, We assume there is at least one non-negative integers in the array.
The algorithm is based on binary search. At first, we can find that the final answer must be in the range [0, max(A)]
, and we half this interval in each iteration, until it is small enough (10-6 for example). In each iteration, assume the available interval is [a,b]
, we need to check whether the maximum average is no less than (a+b)/2
. If so, we get a smaller interval [(a+b)/2, b]
, or else we get [a, (a+b)/2]
.
Now the problem is: Given a number K
, how to check that the final answer is at least K
?
Assume the average is at least K
, there exist some i
, j
such that (A[i] + A[i+1] + ... + A[j]) / (j - i + 1) >= K
. We multiply both sides by (j-i+1)
, and move the right side to left, and we get (A[i] - K) + (A[i+1] - K) + ... + (A[j] - K) >= 0
.
So, let B[i] = A[i] - K
, we only need to find an interval [i, j]
(j - i + 1 > L
) such that B[i] + ... + B[j] >= 0
. Now the problem is: Given array B
and length L
, we are to find an interval of maximum sum whose length is more than L
. If the maximum sum is >= 0
, the original average number K
is possible.
The second problem can be solved by linear scan. Let sumB[0] = 0
, sumB[i] = B[1] + B[2] + ... + B[i]
. For each index i
, the max-sum interval which ended at B[i]
is sumB[i] - min(sumB[0], sumB[1], ..., sumB[i-L-1])
. When scanning the array with increasing i
, we can maintain the min(sumB[0], ..., sumB[i-L-1])
on the fly.
The time complexity of the sub-problem is O(N)
. And we need O(logC)
iterations, so the total complexity is O(N*logC)
.
P.s. This kinds of "average problem" belongs to a family of problems called fractional programming. The similar problems are minimum average-weighted spanning tree, minimum average-weighted cycle, etc.
P.s. again. The O(logC)
is a loose bound. I think we can reduce it by some careful analysis.
Related Topics
SQL Statement with Multiple Sets and Wheres
How to Count Number of Occurrences for All Different Values in Database Column
Is There Any General Rule on SQL Query Complexity VS Performance
How to Insert Null Values into SQL Server
What Is the Fastest Way to Truncate Timestamps to 5 Minutes in Postgres
Sqlite Format Number with 2 Decimal Places Always
Is It Better to Create an Index Before Filling a Table with Data, or After the Data Is in Place
How to Find All Rows with a Null Value in Any Column Using Postgresql
Selecting Most Recent and Specific Version in Each Group of Records, for Multiple Groups
Postgres: Add Constraint If It Doesn't Already Exist
Is There a Coalesce-Like Function in Excel
Why Can't I Simply Add an Index That Includes All Columns
A How to Escape %% When Building Like Queries in Rails 3/Activerecord
Pyspark: Filter Dataframe Based on Multiple Conditions
Generate_Series in Postgres from Start and End Date in a Table