Simple way to calculate median with MySQL
In MariaDB / MySQL:
SELECT AVG(dd.val) as median_val
FROM (
SELECT d.val, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
FROM data d, (SELECT @rownum:=0) r
WHERE d.val is NOT NULL
-- put some where clause here
ORDER BY d.val
) as dd
WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) );
Steve Cohen points out that after the first pass, @rownum will contain the total number of rows. This can be used to determine the median, so no second pass or join is needed.
Also, AVG(dd.val) and dd.row_number IN (...) are used to produce a correct median when there is an even number of records. Reasoning:
SELECT FLOOR((3+1)/2),FLOOR((3+2)/2); -- when total_rows is 3, avg rows 2 and 2
SELECT FLOOR((4+1)/2),FLOOR((4+2)/2); -- when total_rows is 4, avg rows 2 and 3
Finally, MariaDB 10.3.3+ includes a built-in MEDIAN window function.
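To double-check the row-index arithmetic in the user-variable query, here is a minimal Python sketch. It is not part of the original answer, and the function name median_by_row_index is made up for illustration. It mimics the query: drop NULLs, sort, number the rows from 1, pick rows FLOOR((n+1)/2) and FLOOR((n+2)/2), and average them.

```python
from statistics import median


def median_by_row_index(values):
    """Average the rows at FLOOR((n+1)/2) and FLOOR((n+2)/2), like the SQL."""
    # Drop NULLs (None) and sort: WHERE d.val IS NOT NULL ... ORDER BY d.val
    ordered = sorted(v for v in values if v is not None)
    n = len(ordered)
    if n == 0:
        return None  # AVG over zero rows is NULL in SQL
    lower = (n + 1) // 2  # 1-based row number of the lower middle row
    upper = (n + 2) // 2  # 1-based row number of the upper middle row
    picked = {lower, upper}  # IN (...) matches each row once, hence a set
    return sum(ordered[i - 1] for i in picked) / len(picked)


if __name__ == "__main__":
    for sample in ([5, 1, 3], [5, 1, 3, 9]):
        # Compare against Python's own median as a sanity check
        print(sample, median_by_row_index(sample), median(sample))
```

For an odd count both indexes coincide, so the set holds one row and the average is that row's value; for an even count the two middle rows are averaged.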
Calculating a simple median on a column in MySQL
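I would just use DISTINCT with PERCENTILE_CONT and an empty OVER() clause. Note that PERCENTILE_CONT is a MariaDB 10.3.3+ window function; stock MySQL 8.0 does not provide it.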
SELECT DISTINCT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY my_column) OVER () median
FROM my_table
Calculating the Median with Mysql
val is your time column; x and y are two references to the data table (you can write data AS x, data AS y).
EDIT:
To avoid computing your sums twice, you can store the intermediate results.
CREATE TEMPORARY TABLE average_user_total_time
(SELECT SUM(time) AS time_taken
FROM scores
WHERE created_at >= '2010-10-10'
and created_at <= '2010-11-11'
GROUP BY user_id);
Then you can compute the median over these values, which are now in a named table.
EDIT: A temporary table won't work here, because MySQL cannot refer to the same TEMPORARY table more than once in a single query, and the median query needs it twice (as x and y). You could try a regular table with the MEMORY storage engine, or simply repeat the subquery that computes the values in both places. Apart from this, I don't see another solution. That doesn't mean there isn't a better way; maybe somebody else will come up with an idea.
MySQL: Calculating Median of Values grouped by a Column
Your query computes row numbers using user variables, which makes it more complicated to handle partitions. Since you are using MySQL 8.0, I would suggest using window functions instead.
This should get you close to what you expect:
select
SchoolName,
avg(Marks) as median_val
from (
select
SchoolName,
Marks,
row_number() over(partition by SchoolName order by Marks) rn,
count(*) over(partition by SchoolName) cnt
from tablename
) as dd
where rn in ( FLOOR((cnt + 1) / 2), FLOOR( (cnt + 2) / 2) )
group by SchoolName
The arithmetic stays the same, but the window functions now operate on groups of records having the same SchoolName (instead of a single global partition as in your initial query). Then, the outer query filters and aggregates by SchoolName.
In your DB Fiddle, this returns:
| SchoolName | median_val |
| ---------- | ---------- |
| A | 71 |
| B | 254 |
| C | 344 |
| D | 233.5 |
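The partitioned logic can also be traced outside the database. The following Python sketch is an illustration only: the function name median_per_group and the sample rows are hypothetical, not the data from the fiddle. It rebuilds rn and cnt per group and keeps the rows whose rn is FLOOR((cnt+1)/2) or FLOOR((cnt+2)/2).

```python
from collections import defaultdict


def median_per_group(rows):
    """rows: iterable of (group, value) pairs. Returns {group: median}."""
    partitions = defaultdict(list)
    for group, value in rows:
        partitions[group].append(value)  # PARTITION BY SchoolName

    result = {}
    for group, values in partitions.items():
        values.sort()      # ORDER BY Marks inside the partition
        cnt = len(values)  # COUNT(*) OVER (PARTITION BY SchoolName)
        wanted = {(cnt + 1) // 2, (cnt + 2) // 2}
        # ROW_NUMBER() is 1-based, so enumerate from 1
        middle = [v for rn, v in enumerate(values, start=1) if rn in wanted]
        result[group] = sum(middle) / len(middle)  # AVG(Marks) ... GROUP BY
    return result


if __name__ == "__main__":
    # Hypothetical marks, not the data from the question
    sample = [("A", 70), ("A", 72), ("A", 60), ("B", 10), ("B", 20)]
    print(median_per_group(sample))
```

Each group gets its own cnt, so a school with an even number of rows averages two values while a school with an odd number returns its single middle row.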
How to calculate the median category wise in mysql
Assuming you reduce the set to the following. Note: id_student isn't required at this point in the calculation.
CREATE TABLE tscores (
id int primary key auto_increment
, region int
, id_student int
, total_score int
, index (region, total_score)
);
INSERT INTO tscores (region, id_student, total_score) VALUES
(1, 1000, 40)
, (1, 1001, 50)
, (1, 1002, 30)
, (1, 1003, 90)
, (2, 1101, 50)
, (2, 1102, 51)
, (2, 1103, 55)
;
SQL and Result:
WITH cte1 AS (
SELECT region, total_score
, ((COUNT(*) OVER (PARTITION BY region) + 1) / 2) AS n
, ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_score) AS rn
FROM tscores AS t
)
SELECT region
, truncate(AVG(total_score), 2) AS med_score
FROM cte1 AS t
WHERE rn IN (ceil(n), floor(n))
GROUP BY region
;
+--------+-----------+
| region | med_score |
+--------+-----------+
| 1 | 45.00 |
| 2 | 51.00 |
+--------+-----------+
2 rows in set (0.004 sec)
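This query picks the middle rows slightly differently: n = (cnt + 1) / 2, then rows ceil(n) and floor(n). A small Python sketch (helper names are made up for illustration) shows that this selects the same rows as FLOOR((cnt+1)/2) and FLOOR((cnt+2)/2), and reproduces the result above from the sample rows. It relies on / being true division, as in MySQL; in engines where integer / integer truncates (SQLite, PostgreSQL, SQL Server), the expression would presumably need 2.0 as the divisor.

```python
import math


def middle_row_numbers(cnt):
    """Row numbers selected by rn IN (ceil(n), floor(n)), n = (cnt + 1) / 2."""
    n = (cnt + 1) / 2  # true division, like MySQL's / operator
    return {math.floor(n), math.ceil(n)}


def median_scores(rows):
    """rows: (region, total_score) pairs. Returns {region: median score}."""
    by_region = {}
    for region, score in rows:
        by_region.setdefault(region, []).append(score)
    medians = {}
    for region, scores in by_region.items():
        scores.sort()  # ORDER BY total_score
        wanted = middle_row_numbers(len(scores))
        picked = [s for rn, s in enumerate(scores, start=1) if rn in wanted]
        medians[region] = sum(picked) / len(picked)
    return medians


# The (region, total_score) pairs from the INSERT statement
SAMPLE = [(1, 40), (1, 50), (1, 30), (1, 90), (2, 50), (2, 51), (2, 55)]

if __name__ == "__main__":
    print(median_scores(SAMPLE))
```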
Still not quite enough detail. But here's SQL that runs against your schema, minus the typos I think you had in your SQL:
WITH tscores AS (
SELECT i.region AS region
, Sum(S.score) AS total_score
FROM tredence.assessments A
JOIN tredence.studentassessment S
ON A.id_assessment = S.id_assessment
JOIN tredence.studentinfo i
ON i.id_student = S.id_student
WHERE A.assessment = 'Exam'
GROUP BY S.id_student
, i.region
)
, cte1 AS (
SELECT region, total_score
, ((COUNT(*) OVER (PARTITION BY region) + 1) / 2) AS n
, ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_score) AS rn
FROM tscores AS t
)
SELECT region
, truncate(AVG(total_score), 2) AS med_score
FROM cte1 AS t
WHERE rn IN (ceil(n), floor(n))
GROUP BY region
;
Calculating the median with where clause condition - sqlite
One approach uses analytic functions:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY Stay) rn,
COUNT(*) OVER () AS cnt,
AVG(Room_Spend + Food_Spend) OVER () AS total_spent
FROM test
)
SELECT AVG(Stay) AS Stay, MAX(total_spent) AS total_spent
FROM cte
WHERE (cnt % 2 = 1 AND rn = cnt / 2 + 1) OR
(cnt % 2 = 0 AND rn IN (cnt / 2, cnt / 2 + 1));
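To try this query without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The column names come from the query; the column types, the sample rows, and the helper name run_median are assumptions made for illustration. Window functions need SQLite 3.25 or newer, which recent Python builds normally bundle.

```python
import sqlite3

QUERY = """
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Stay) rn,
           COUNT(*) OVER () AS cnt,
           AVG(Room_Spend + Food_Spend) OVER () AS total_spent
    FROM test
)
SELECT AVG(Stay) AS Stay, MAX(total_spent) AS total_spent
FROM cte
WHERE (cnt % 2 = 1 AND rn = cnt / 2 + 1)
   OR (cnt % 2 = 0 AND rn IN (cnt / 2, cnt / 2 + 1));
"""


def run_median(rows):
    """Load (Stay, Room_Spend, Food_Spend) rows into an in-memory table
    and return the (median Stay, average total spend) row."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: only the column names appear in the original query
    conn.execute(
        "CREATE TABLE test (Stay INTEGER, Room_Spend REAL, Food_Spend REAL)"
    )
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchone()
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical rows, not data from the question
    rows = [(1, 100, 20), (4, 200, 40), (2, 150, 30), (7, 300, 60)]
    print(run_median(rows))
```

If rows need to be filtered first, the WHERE condition would go inside the CTE, so that rn and cnt are computed over the filtered rows only.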