Calculating Percentile Rankings in MS SQL

Calculating percentile rank in MySQL

This is a relatively ugly answer, and I feel guilty saying it. That said, it might help you with your issue.

One way to determine the percentile is to count all of the rows, then count the rows whose value falls on one side of the number you provided. You can count either the greater or the lesser side and take the complement as necessary.

Create an index on your number.
total = select count(*) from your_table;
less_equal = select count(*) from your_table where value <= indexed_number;

The percentage would be something like: less_equal / total or (total - less_equal)/total
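The two counts described above can be tried out without a MySQL server. The following is a minimal sketch using Python's built-in sqlite3 module; the table name scores, the column score, and the helper percentile_by_count are made up for illustration and are not part of the original answer.

```python
import sqlite3


def percentile_by_count(conn, value):
    """Return the percentage of rows whose score is <= value.

    Assumes a hypothetical table scores(score); returns None if it is empty.
    """
    total = conn.execute("SELECT COUNT(*) FROM scores").fetchone()[0]
    less_equal = conn.execute(
        "SELECT COUNT(*) FROM scores WHERE score <= ?", (value,)
    ).fetchone()[0]
    return 100.0 * less_equal / total if total else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, score INTEGER)")
# The index lets both counts be answered from the index alone.
conn.execute("CREATE INDEX idx_scores_score ON scores (score)")
conn.executemany(
    "INSERT INTO scores (score) VALUES (?)", [(s,) for s in (10, 20, 30, 40, 50)]
)

print(percentile_by_count(conn, 30))  # 3 of the 5 rows are <= 30
```

Taking the complement (100 minus the result) gives the share of rows strictly greater than the value.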

Make sure that both queries use the index that you created; if they do not, tweak them until they do. The EXPLAIN output should show "Using index" in the Extra column (the rightmost one). For the select count(*), InnoDB should report "Using index", while MyISAM shows something like "Select tables optimized away", because MyISAM stores the exact row count and can return it without scanning anything.

If you needed to have the percentage stored in the database, you can use the setup from above for performance and then calculate the value for each row by using the second query as an inner select. The first query's value can be set as a constant.
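Storing the percentage per row might look roughly like the sketch below, again in Python with sqlite3 rather than MySQL. The table scores and its pct column are hypothetical. The total is computed once and passed in as a constant, and the second count runs as an inner select for each row. Note that MySQL restricts selecting from the table being updated inside a subquery, so there the inner select would probably need to be wrapped in a derived table or rewritten as a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (id INTEGER PRIMARY KEY, score INTEGER, pct REAL)"
)
conn.execute("CREATE INDEX idx_scores_score ON scores (score)")
conn.executemany(
    "INSERT INTO scores (score) VALUES (?)", [(s,) for s in (10, 20, 20, 40)]
)

# First query: the total row count, computed once and reused as a constant.
total = conn.execute("SELECT COUNT(*) FROM scores").fetchone()[0]

# Second query as an inner select: for each row, count rows with score <= its own.
conn.execute(
    """
    UPDATE scores
    SET pct = 100.0 * (
        SELECT COUNT(*) FROM scores AS s2 WHERE s2.score <= scores.score
    ) / ?
    """,
    (total,),
)

rows = conn.execute("SELECT score, pct FROM scores ORDER BY id").fetchall()
print(rows)
```

Stored values go stale as soon as rows are inserted or changed, so the update would need to be rerun whenever the data changes.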

Does this help?

Jacob

How to calculate 90th Percentile, SD, Mean for data in SQL

You can use the new suite of analytic functions introduced in SQL Server 2012:

SELECT DISTINCT
[Month],
Mean = AVG(Score) OVER (PARTITION BY [Month]),
StdDev = STDEV(Score) OVER (PARTITION BY [Month]),
P90 = PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY Score) OVER (PARTITION BY [Month])
FROM my_table

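There are two percentile functions: PERCENTILE_CONT, which interpolates between values (continuous distribution), and PERCENTILE_DISC, which always returns a value that actually occurs in the data (discrete distribution). Pick the one that suits your needs.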
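To make the difference between the two concrete, here is a small Python sketch of the usual definitions: linear interpolation between the two nearest rows for the continuous version, and the first value whose cumulative distribution reaches p for the discrete version. The function names simply mirror the SQL ones; this is an illustration of the idea, not SQL Server's implementation.

```python
import math


def percentile_cont(values, p):
    """Interpolated percentile: may return a value not present in the data."""
    if not values:
        raise ValueError("values must not be empty")
    xs = sorted(values)
    pos = p * (len(xs) - 1)  # fractional zero-based position in the sorted list
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return float(xs[lo])
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def percentile_disc(values, p):
    """Discrete percentile: smallest value whose cumulative share is >= p."""
    if not values:
        raise ValueError("values must not be empty")
    xs = sorted(values)
    # round() guards against float noise such as 0.7 * 10 == 7.000000000000001
    k = max(math.ceil(round(p * len(xs), 9)), 1)
    return xs[k - 1]


scores = [10, 20, 30, 40]
print(percentile_cont(scores, 0.5))  # halfway between 20 and 30
print(percentile_disc(scores, 0.5))  # an actual value from the list
```

With an even number of rows the median shows the difference most clearly: the continuous version lands between the two middle values, while the discrete version picks the lower one.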

Correct way of calculating percentile

Use DISTINCT

SELECT DISTINCT [col1], [col2], 
PERCENTILE_CONT(0.99)
WITHIN GROUP (ORDER BY [Values] ASC)
OVER (PARTITION BY [col1], [col2]) "Percentile_Cont"
FROM MyTable

MySQL - Calculating Percentile Ranks on the fly

Using Shlomi's code above, here's the code that I came up with to calculate percentile ranks (in case anyone wants to calculate these in the future):

-- rank is a reserved word as of MySQL 8.0, hence the backticks
SELECT
    c.id, c.score, ROUND(((@rank - c.`rank`) / @rank) * 100, 2) AS percentile_rank
FROM
    (SELECT
        *,
        @prev:=@curr,
        @curr:=a.score,
        @rank:=IF(@prev = @curr, @rank, @rank + 1) AS `rank`
    FROM
        (SELECT id, score FROM mytable) AS a,
        (SELECT @curr:= null, @prev:= null, @rank:= 0) AS b
    ORDER BY score DESC) AS c;
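For anyone trying to follow what the query computes, the logic can be sketched in plain Python. This assumes the inner derived table is fully evaluated before the outer SELECT reads @rank, so that @rank holds the highest dense rank by then. The function name percentile_ranks and the sample rows are made up for illustration.

```python
def percentile_ranks(rows):
    """rows: iterable of (id, score). Returns [(id, score, percentile_rank)].

    Mirrors the query: dense-rank scores in descending order, then scale
    (max_rank - rank) / max_rank to a percentage rounded to 2 decimals.
    """
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    ranked = []
    rank, prev = 0, None
    for id_, score in ordered:
        if score != prev:  # ties keep the same rank, like the IF(...) in SQL
            rank += 1
        prev = score
        ranked.append((id_, score, rank))
    max_rank = rank  # plays the role of the final @rank value
    return [
        (id_, score, round((max_rank - r) / max_rank * 100, 2))
        for id_, score, r in ranked
    ]


result = percentile_ranks([(1, 90), (2, 80), (3, 80), (4, 70)])
for row in result:
    print(row)
```

Note that with this formula the lowest score always gets 0 and the top score never reaches 100, which may or may not be the convention you want.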

Select count of max items, get rank and percentile

select
person_id,
count(*) over() as total_person,
rank() over(order by score desc) as score_rank
from (
select distinct on (person_id) *
from score
where category_id = 7
order by person_id, created desc
) s

Check rank, dense_rank, percent_rank, ntile, and cume_dist:

http://www.postgresql.org/docs/current/static/functions-window.html

distinct on returns a single row for each person_id. The order by clause determines which row is kept: here, ordering by created desc keeps each person's most recent score.
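A rough way to experiment with this is Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first version with window functions). SQLite has no distinct on, so this sketch swaps in row_number() to keep the latest row per person, which is a different technique from the PostgreSQL query above. The sample data is invented, and percent_rank is added to show one of the functions from the linked page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score "
    "(person_id INTEGER, category_id INTEGER, score INTEGER, created TEXT)"
)
conn.executemany(
    "INSERT INTO score VALUES (?, ?, ?, ?)",
    [
        (1, 7, 50, "2020-01-01"),
        (1, 7, 80, "2020-02-01"),  # person 1's latest score in category 7
        (2, 7, 60, "2020-01-15"),
        (3, 7, 90, "2020-01-20"),
        (3, 8, 10, "2020-03-01"),  # other category, filtered out
    ],
)

query = """
SELECT person_id,
       COUNT(*) OVER () AS total_person,
       RANK() OVER (ORDER BY score DESC) AS score_rank,
       PERCENT_RANK() OVER (ORDER BY score) AS pct_rank
FROM (
    -- stand-in for DISTINCT ON (person_id) ... ORDER BY person_id, created DESC
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY person_id ORDER BY created DESC
           ) AS rn
    FROM score
    WHERE category_id = 7
) AS s
WHERE rn = 1
ORDER BY score_rank
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

percent_rank is computed as (rank - 1) / (rows - 1), so it runs from 0 for the lowest score to 1 for the highest.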


