Calculating percentages with GROUP BY query
WITH t1 AS
(SELECT User, Rating, Count(*) AS n
FROM your_table
GROUP BY User, Rating)
SELECT User, Rating, n,
(0.0+n)/(SUM(n) OVER (PARTITION BY User)) AS pct -- no integer divide! SUM(n) is the user's total row count
FROM t1;
Or
SELECT DISTINCT User, Rating, Count(*) OVER w_user_rating AS n, -- DISTINCT collapses the per-row output to one row per (User, Rating)
(0.0+Count(*) OVER w_user_rating)/(Count(*) OVER (PARTITION BY User)) AS pct
FROM your_table
WINDOW w_user_rating AS (PARTITION BY User, Rating);
I would compare the query plans of the two versions with the appropriate tool for your RDBMS (for example EXPLAIN) and keep whichever one performs better.
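To try the grouped-then-windowed approach without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, and the sketch assumes the bundled SQLite is 3.25 or newer (window functions). It divides each group count by SUM(n) OVER (PARTITION BY user), which is that user's total number of rows.

```python
import sqlite3

# Hypothetical sample data: (user, rating) pairs invented for this sketch.
rows = [("ann", 5), ("ann", 5), ("ann", 3), ("ann", 1),
        ("bob", 4), ("bob", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (user TEXT, rating INTEGER)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)

query = """
WITH t1 AS (
    SELECT user, rating, COUNT(*) AS n
    FROM your_table
    GROUP BY user, rating
)
SELECT user, rating, n,
       1.0 * n / SUM(n) OVER (PARTITION BY user) AS pct  -- 1.0 forces real division
FROM t1
ORDER BY user, rating
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('ann', 1, 1, 0.25)
# ('ann', 3, 1, 0.25)
# ('ann', 5, 2, 0.5)
# ('bob', 4, 2, 1.0)
```

Each user's pct values add up to 1.0, which is a quick sanity check for any variant of the query.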
Calculating percentage within a group
You can do it with a sub-select and a join:
SELECT t1.sex, employed, count(*) AS `count`, count(*) / t2.total AS percent
FROM my_table AS t1
JOIN (
SELECT sex, count(*) AS total
FROM my_table
GROUP BY sex
) AS t2
ON t1.sex = t2.sex
GROUP BY t1.sex, employed;
I can't think of other approaches off the top of my head.
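The sub-select and join can be checked the same way. This is only a sketch run against SQLite through Python's sqlite3 module, with invented rows. One assumption to flag: SQLite divides integers as integers, so the sketch multiplies by 1.0, which the MySQL-style query above does not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (sex TEXT, employed INTEGER)")
# Hypothetical rows: four 'F' (three employed) and two 'M' (one employed).
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("F", 1), ("F", 1), ("F", 0), ("F", 1), ("M", 1), ("M", 0)],
)

query = """
SELECT t1.sex, t1.employed, COUNT(*) AS cnt,
       1.0 * COUNT(*) / t2.total AS fraction   -- share within each sex
FROM my_table AS t1
JOIN (
    SELECT sex, COUNT(*) AS total              -- one total per sex
    FROM my_table
    GROUP BY sex
) AS t2
  ON t1.sex = t2.sex
GROUP BY t1.sex, t1.employed
ORDER BY t1.sex, t1.employed
"""
join_result = conn.execute(query).fetchall()
for row in join_result:
    print(row)
# ('F', 0, 1, 0.25)
# ('F', 1, 3, 0.75)
# ('M', 0, 1, 0.5)
# ('M', 1, 1, 0.5)
```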
SQL - Calculate percentage by group, for multiple groups
SELECT
SIGN(Orders),
ROUND(COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month), 2) AS Parts,
Month
FROM T
GROUP BY Month, SIGN(Orders)
ORDER BY Month, SIGN(Orders)
Demo on Postgres:
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=4cd2d1455673469c2dfc060eccea8020
You've stated that it's important for the total to be 100%, so you might consider rounding down in the "no orders" case and rounding up in the "has orders" case whenever the percentage falls precisely on an odd multiple of 0.5%. Rounding toward even, or rounding the smaller share down, might be better options:
WITH DATA AS (
SELECT SIGN(Orders) AS HasOrders, Month,
COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month) AS PartsPercent
FROM T
GROUP BY Month, SIGN(Orders)
ORDER BY Month, SIGN(Orders)
)
select HasOrders, Month, PartsPercent,
PartsPercent - TRUNC(PartsPercent) AS Fraction,
CASE WHEN HasOrders = 0
THEN FLOOR(PartsPercent) ELSE CEILING(PartsPercent)
END AS PartsRound0Down,
CASE WHEN PartsPercent - TRUNC(PartsPercent) = 0.5
AND MOD(TRUNC(PartsPercent), 2) = 0
THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
END AS PartsRoundTowardEven,
CASE WHEN PartsPercent - TRUNC(PartsPercent) = 0.5 AND PartsPercent < 50
THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
END AS PartsSmallestTowardZero
from DATA
It's usually not advisable to test floating-point values for equality, and I don't know how BigQuery's float64 will behave in the comparison against 0.5, although one half is exactly representable in binary. See these in a case where the breakout is 101 vs 99. I don't have immediate access to BigQuery, so be aware that Postgres's rounding behavior is different:
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=c8237e272427a0d1114c3d8056a01a09
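To see why the halfway case matters, here is a small sketch of the 101 vs 99 breakout in plain Python. It uses the decimal module so that 50.5 and 49.5 are exact, sidestepping the floating-point equality concern. The helper names and dictionary keys are hypothetical, and the rounding modes are Python's, not BigQuery's or Postgres's.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_FLOOR,
                     ROUND_HALF_EVEN, ROUND_HALF_UP)

def percent(count, total):
    """Exact percentage as a Decimal (hypothetical helper)."""
    return Decimal(count) * 100 / Decimal(total)

def to_int(value, mode):
    """Round a Decimal to a whole number using the given rounding mode."""
    return int(value.quantize(Decimal("1"), rounding=mode))

counts = {"has_orders": 101, "no_orders": 99}
total = sum(counts.values())
pct = {k: percent(v, total) for k, v in counts.items()}  # 50.5 and 49.5

# Plain "half up" rounding overshoots: 51 + 50 = 101.
half_up = {k: to_int(v, ROUND_HALF_UP) for k, v in pct.items()}

# Rounding halves toward even gives 50 + 50 = 100.
half_even = {k: to_int(v, ROUND_HALF_EVEN) for k, v in pct.items()}

# Round "has orders" up and "no orders" down: 51 + 49 = 100.
directed = {
    "has_orders": to_int(pct["has_orders"], ROUND_CEILING),
    "no_orders": to_int(pct["no_orders"], ROUND_FLOOR),
}

print(sum(half_up.values()), sum(half_even.values()), sum(directed.values()))
# 101 100 100
```

Both alternatives keep the total at 100 here; which one is appropriate depends on how you want the tie to be reported.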
Calculate percentage within a subgroup in R
You first group by country to get the sum for each country. Then you group by country and motif and use the sum for each country to calculate your frequency.
am2 %>%
group_by(country) %>%
mutate(sum_country = sum(number)) %>%
group_by(country, motif) %>%
mutate(freq = number/sum_country,
freq_perc = round(freq*100, 2)) # %>% binds tighter than *, so freq*100 %>% round(2) would only round the 100
ggplot2 example:
df <- am2 %>%
group_by(country) %>%
mutate(sum_country = sum(number)) %>%
group_by(country, motif) %>%
mutate(freq = number/sum_country,
freq_perc = round(freq*100, 2)) # round the product, not just the 100
library(ggplot2)
df %>%
ggplot( aes(x=country, y=freq_perc, fill=motif)) +
geom_bar(stat="identity", position="dodge")
How to calculate percentage in group by with condition?
You can do conditional aggregation. I like to do this with avg():
select provider, avg(status = 'failure') failure_ratio
from mytable
group by provider
For each provider, this gives you a numeric value between 0 and 1 that represents the ratio of records in 'failure' status. You can multiply that by 100 if you want a percentage instead.
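As a quick check of the avg() trick, here is a sketch against SQLite via Python's sqlite3 module, with invented rows. It relies on the engine evaluating status = 'failure' to 0 or 1, which SQLite and MySQL do. The CASE column shows a more portable spelling for engines that do not treat comparisons as numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (provider TEXT, status TEXT)")
# Hypothetical rows: provider 'a' fails 1 of 4 times, provider 'b' fails 2 of 2.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("a", "failure"), ("a", "success"), ("a", "success"), ("a", "success"),
     ("b", "failure"), ("b", "failure")],
)

query = """
SELECT provider,
       AVG(status = 'failure') AS failure_ratio,            -- comparison yields 0 or 1
       100.0 * AVG(CASE WHEN status = 'failure' THEN 1.0
                        ELSE 0.0 END) AS failure_pct        -- portable CASE form
FROM mytable
GROUP BY provider
ORDER BY provider
"""
avg_result = conn.execute(query).fetchall()
for row in avg_result:
    print(row)
# ('a', 0.25, 25.0)
# ('b', 1.0, 100.0)
```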
Using dplyr function to calculate percentage within groups
library(dplyr)
df %>%
# line below to freeze order of type_n if not ordered factor already
mutate(type_n = forcats::fct_inorder(type_n)) %>%
group_by(type_n) %>%
summarize(n = n(), total = sum(population)) %>%
mutate(new_col = (n / total) %>% scales::percent(decimal.mark = ",", suffix = ""))
# A tibble: 3 x 4
type_n n total new_col
<fct> <int> <int> <chr>
1 small 2 7 28,6
2 medium 2 14 14,3
3 large 3 15 20,0
How to calculate the percentage of values with grouping
If the column type has only the values 0 and 1:
SELECT
`group`,
round(100.0 * sum(type) / count(*), 0) as percentage
FROM country
GROUP BY `group`;
Results:
| group | percentage |
| ----- | ---------- |
| 1 | 50 |
| 2 | 67 |
| 3 | 0 |
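The results above can be reproduced with a small sketch in Python's sqlite3 module. The rows are invented so that they yield the same three percentages, and the column name group is double-quoted here because that is the standard SQL quoting SQLite expects, in place of MySQL's backticks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE country ("group" INTEGER, type INTEGER)')
# Hypothetical rows: group 1 has one 1 out of two, group 2 two out of three,
# group 3 none out of one.
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [(1, 0), (1, 1), (2, 1), (2, 1), (2, 0), (3, 0)],
)

query = """
SELECT "group",
       ROUND(100.0 * SUM(type) / COUNT(*), 0) AS percentage  -- 100.0 avoids integer division
FROM country
GROUP BY "group"
ORDER BY "group"
"""
pct_result = conn.execute(query).fetchall()
for row in pct_result:
    print(row)
# (1, 50.0)
# (2, 67.0)
# (3, 0.0)
```

Because type only holds 0 and 1, SUM(type) counts the rows with value 1, so the ratio to COUNT(*) is the share of ones in each group.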