How to Compute Weighted Mean in R

How to calculate weighted means for each column (or row) of a matrix using the columns (or rows) from another matrix?

You may simply do

colSums(m*w)/colSums(w)  ## columns
# [1] 0.2519816 0.4546775 0.7812545
rowSums(m*w)/rowSums(w) ## rows
# [1] 0.2147437 0.5273465 1.0559481

which should be fastest.

Or, if you prefer to stick with weighted.mean(), you can use mapply over the columns of both matrices.

mapply(weighted.mean, as.data.frame(m), as.data.frame(w), USE.NAMES = FALSE)  ## columns
# [1] 0.2519816 0.4546775 0.7812545
mapply(weighted.mean, as.data.frame(t(m)), as.data.frame(t(w)), USE.NAMES = FALSE)  ## rows
# [1] 0.2147437 0.5273465 1.0559481
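
The answer above does not show how m and w were built, so the following is only a sketch on made-up matrices (the seed and dimensions are arbitrary assumptions). It checks that the colSums/rowSums shortcut agrees with calling weighted.mean() on each column or row:

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical inputs: values in m, matching weights in w (same dimensions).
m <- matrix(runif(12), nrow = 4)
w <- matrix(runif(12), nrow = 4)

# Vectorised shortcut: sum(value * weight) / sum(weight), per column and per row.
col_fast <- colSums(m * w) / colSums(w)
row_fast <- rowSums(m * w) / rowSums(w)

# Reference: call weighted.mean() on one column (or row) at a time.
col_ref <- sapply(seq_len(ncol(m)), function(j) weighted.mean(m[, j], w[, j]))
row_ref <- sapply(seq_len(nrow(m)), function(i) weighted.mean(m[i, ], w[i, ]))

# Both approaches should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(col_fast, col_ref)))
stopifnot(isTRUE(all.equal(row_fast, row_ref)))
```

The shortcut assumes there are no missing values; with NAs in m or w the two approaches would need explicit handling.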

How to calculate weighted mean using mutate_at in R?

There is a lot to unpack here...

  1. You probably want summarise rather than mutate: with mutate you would just repeat each group's result on every row.
  2. mutate_at and summarise_at are superseded; use across instead.
  3. Your code failed for two reasons. First, you did not write the function as a formula (there was no ~ at the beginning). Second, you used df$Population instead of Population. When you write Population, summarise knows you mean the Population column, which at that point is grouped like the rest of the data frame. df$Population refers to the column of the original, ungrouped data frame. Not only is that wrong, it also raises an error, because the length of the variable being averaged within a group does not match the length of the weights in df$Population.
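
The length mismatch described in point 3 can be reproduced in base R without dplyr. This is a minimal sketch on invented data (toy, g, x and pop are hypothetical names, not the asker's columns): weights taken from the matching group work, while weights taken from the whole column make weighted.mean() stop.

```r
# Hypothetical toy data: two groups of three rows each.
toy <- data.frame(
  g   = rep(c("a", "b"), each = 3),
  x   = c(1, 2, 3, 4, 5, 6),
  pop = c(1, 1, 2, 1, 2, 1)
)

x_a <- toy$x[toy$g == "a"]  # the 3 values belonging to group "a"

# Weights restricted to the same group: lengths match (3 and 3).
ok <- weighted.mean(x_a, toy$pop[toy$g == "a"])

# Weights from the whole, ungrouped column (the df$Population mistake):
# 3 values against 6 weights, so weighted.mean() signals an error.
bad <- tryCatch(
  weighted.mean(x_a, toy$pop),
  error = function(e) e
)

print(ok)
print(inherits(bad, "error"))
```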

Here is how you could do it:

library(dplyr)

df %>%
  group_by(cz) %>%
  summarise(across(all_of(vlist), ~ weighted.mean(.x, Population)),
            .groups = "drop")

If you really need to use summarise_at (probably because you are on a dplyr version older than 1.0.0), you could do:

df %>%
  group_by(cz) %>%
  summarise_at(vlist, ~ weighted.mean(., Population)) %>%
  ungroup()

I assumed df and vlist looked like the following:

vlist <- c("Public_Welf_Total_Exp", "Welf_Cash_Total_Exp", "Welf_Cash_Cash_Assist", "Welf_Ins_Total_Exp","Total_Educ_Direct_Exp", "Higher_Ed_Total_Exp", "Welf_NEC_Cap_Outlay","Welf_NEC_Direct_Expend", "Welf_NEC_Total_Expend", "Total_Educ_Assist___Sub", "Health_Total_Expend", "Total_Hospital_Total_Exp", "Welf_Vend_Pmts_Medical","Hosp_Other_Total_Exp","Unemp_Comp_Total_Exp", "Unemp_Comp_Cash___Sec", "Total_Unemp_Rev", "Hous___Com_Total_Exp", "Hous___Com_Construct")
df <- as.data.frame(matrix(rnorm(length(vlist) * 100), ncol = length(vlist)))
names(df) <- vlist
df$cz <- rep(letters[1:10], each = 10)
df$Population <- runif(100)
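
As a cross-check that does not depend on dplyr, the same grouped weighted mean can be sketched in base R with split() and sapply(). The data frame and column names below (toy, exp_1, exp_2) are invented stand-ins that are small enough to verify by hand; they are not the simulated df above.

```r
# Hypothetical toy data: two commuting zones, two expenditure columns.
toy <- data.frame(
  cz         = c("a", "a", "b", "b"),
  Population = c(1, 3, 2, 2),
  exp_1      = c(10, 20, 30, 50),
  exp_2      = c(1, 2, 3, 4)
)
cols <- c("exp_1", "exp_2")

# Split into one data frame per group, then take the weighted mean of each
# selected column using that group's own Population as weights.
by_cz <- split(toy, toy$cz)
res <- t(sapply(by_cz, function(g) {
  sapply(g[cols], weighted.mean, w = g$Population)
}))

# res is a matrix with one row per cz and one column per variable.
print(res)
```

Each group only ever sees its own weights here, which is the same guarantee that writing Population (rather than df$Population) gives inside a grouped summarise.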

How to calculate weighted average with Rstudio

Here's the dplyr version of your algorithm:

library(dplyr)
inv %>%
  group_by(Date) %>%
  mutate(
    weight = Quantity / sum(Quantity)
  ) %>%
  summarize(
    result = sum(Quantity * weight)
  )
# # A tibble: 2 × 2
#   Date       result
#   <chr>       <dbl>
# 1 2020-01-01   80.5
# 2 2020-02-02   54.2

Or we can use the built-in weighted.mean function directly for the same result:

inv %>%
  group_by(Date) %>%
  summarize(
    result = weighted.mean(Quantity, w = Quantity / sum(Quantity))
  )
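
Note that weighted.mean() rescales its weights itself, so dividing by sum(Quantity) first is optional. The sketch below uses an invented quantity vector for a single date (the real inv data is not shown in the answer) to check that normalised weights, raw weights, and the manual sum all give the same number:

```r
# Hypothetical quantities for one date; not taken from the original inv data.
quantity <- c(100, 50, 10)

# Weights normalised to sum to 1, as in the answer above.
normalised <- weighted.mean(quantity, w = quantity / sum(quantity))

# Raw weights: weighted.mean() divides by sum(w) internally.
raw <- weighted.mean(quantity, w = quantity)

# The manual version from the mutate/summarize pipeline.
by_hand <- sum(quantity * quantity / sum(quantity))

stopifnot(isTRUE(all.equal(normalised, raw)))
stopifnot(isTRUE(all.equal(normalised, by_hand)))
```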

If I'm misunderstanding the goal, please edit your question to show the desired output for the sample input.

Calculate the weighted mean for a variable, which has two types with R

You can try this using dcast from data.table, replacing fun.aggregate with whichever function you need.

library(data.table)

dcast(data,
      id_municipio ~ v0220,
      fun.aggregate = mean,
      value.var = "peso_amostral")

OUTPUT:

  id_municipio    1    2
1      1100015 3.98 3.37
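
The call above averages peso_amostral itself. If the goal is a mean of some other variable weighted by peso_amostral, one possible approach is to aggregate first and reshape afterwards. This is only a sketch: the value column renda and all the numbers are invented for illustration, and only id_municipio, v0220 and peso_amostral come from the answer.

```r
library(data.table)

# Hypothetical data: `renda` is an invented value column.
data <- data.table(
  id_municipio  = c(1100015, 1100015, 1100015, 1100015),
  v0220         = c(1, 1, 2, 2),
  renda         = c(100, 200, 300, 500),
  peso_amostral = c(1, 3, 2, 2)
)

# Weighted mean of renda within each municipality/type pair (long format).
wm_long <- data[, .(wm = weighted.mean(renda, peso_amostral)),
                by = .(id_municipio, v0220)]

# Spread the two types of v0220 into columns, mirroring the dcast call above.
wm_wide <- dcast(wm_long, id_municipio ~ v0220, value.var = "wm")
print(wm_wide)
```

Aggregating before reshaping sidesteps the fact that fun.aggregate receives only the value column, so it has no direct access to the weights.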

