How to calculate weighted means for each column (or row) of a matrix using the columns (or rows) from another matrix?
You may simply do
colSums(m*w)/colSums(w) ## columns
# [1] 0.2519816 0.4546775 0.7812545
rowSums(m*w)/rowSums(w) ## rows
# [1] 0.2147437 0.5273465 1.0559481
which should be fastest.
Or, if you prefer to stick with weighted.mean(), you can use mapply.
mapply(weighted.mean, as.data.frame(m), as.data.frame(w), USE.NAMES=FALSE) ## columns
# [1] 0.2519816 0.4546775 0.7812545
mapply(weighted.mean, as.data.frame(t(m)), as.data.frame(t(w)), USE.NAMES=FALSE) ## rows
# [1] 0.2147437 0.5273465 1.0559481
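As a quick sanity check, here is a minimal sketch showing that the colSums() shortcut agrees with a column-by-column weighted.mean(). The matrices m and w below are made up for illustration (the originals are not shown here), so the numbers will differ from the output above. The second half is an assumption about how you might want to treat missing values: weighted.mean(..., na.rm = TRUE) drops the weights that belong to NA values, so the denominator has to drop them too.

```r
set.seed(42)
m <- matrix(runif(12), nrow = 4)   # made-up values
w <- matrix(runif(12), nrow = 4)   # made-up weights, same shape as m

# Vectorised version versus an explicit loop over columns
by_sums <- colSums(m * w) / colSums(w)
by_loop <- sapply(seq_len(ncol(m)), function(j) weighted.mean(m[, j], w[, j]))
all.equal(by_sums, by_loop)

# With missing values: zero out the weights wherever m is NA
m_na <- m
m_na[2, 1] <- NA
na_sums <- colSums(m_na * w, na.rm = TRUE) / colSums(w * (!is.na(m_na)))
na_loop <- sapply(seq_len(ncol(m_na)),
                  function(j) weighted.mean(m_na[, j], w[, j], na.rm = TRUE))
all.equal(na_sums, na_loop)
```

Both all.equal() calls should return TRUE. For rows, swap colSums() for rowSums().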
How to calculate weighted mean using mutate_at in R?
There is a lot to unpack here...
- You probably mean summarise instead of mutate, because with mutate you would just replicate your result for each row. mutate_at and summarise_at are superseded, and you should use across instead.
- Your code wasn't working because you did not write your function as a formula (you did not add ~ at the beginning), and because you were using df$Population instead of Population. When you write Population, summarise knows you're talking about the column Population which, at that point, is grouped like the rest of the data frame. When you use df$Population, you are calling the column of the original data frame without grouping. Not only is that wrong, but you would also get an error, because the length of the variable you are trying to average and the length of the weights provided by df$Population would not match.
Here is how you could do it:
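library(dplyr)
df %>%
  group_by(cz) %>%
  # all_of() selects the columns named in the character vector vlist;
  # the lambda makes sure Population is evaluated within each group
  summarise(across(all_of(vlist), ~ weighted.mean(.x, Population)),
            .groups = "drop")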
If you really need to use summarise_at (and you are probably using an old version of dplyr [lower than 1.0.0]), then you could do:
df %>%
group_by(cz) %>%
summarise_at(vlist, ~weighted.mean(., Population)) %>%
ungroup()
I assumed df and vlist look like the following:
vlist <- c("Public_Welf_Total_Exp", "Welf_Cash_Total_Exp", "Welf_Cash_Cash_Assist", "Welf_Ins_Total_Exp","Total_Educ_Direct_Exp", "Higher_Ed_Total_Exp", "Welf_NEC_Cap_Outlay","Welf_NEC_Direct_Expend", "Welf_NEC_Total_Expend", "Total_Educ_Assist___Sub", "Health_Total_Expend", "Total_Hospital_Total_Exp", "Welf_Vend_Pmts_Medical","Hosp_Other_Total_Exp","Unemp_Comp_Total_Exp", "Unemp_Comp_Cash___Sec", "Total_Unemp_Rev", "Hous___Com_Total_Exp", "Hous___Com_Construct")
df <- as.data.frame(matrix(rnorm(length(vlist) * 100), ncol = length(vlist)))
names(df) <- vlist
df$cz <- rep(letters[1:10], each = 10)
df$Population <- runif(100)
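If you want to convince yourself that the weights are taken per group, here is a small sketch on a data frame tiny enough to check by hand. The names toy and spend are made up for this example; only cz and Population mirror the columns above. The last part reproduces the length mismatch you get when the weights come from the ungrouped column.

```r
library(dplyr)

# Hypothetical toy data: two groups of three rows each
toy <- data.frame(
  cz         = rep(c("a", "b"), each = 3),
  spend      = c(1, 2, 3, 4, 5, 6),
  Population = c(1, 1, 2, 3, 1, 1)
)

res <- toy %>%
  group_by(cz) %>%
  summarise(across(all_of("spend"), ~ weighted.mean(.x, Population)),
            .groups = "drop")
res$spend
# group a: (1*1 + 2*1 + 3*2) / 4 = 2.25
# group b: (4*3 + 5*1 + 6*1) / 5 = 4.6

# Ungrouped weights: 3 values but 6 weights, so weighted.mean() stops
err <- tryCatch(
  weighted.mean(toy$spend[toy$cz == "a"], toy$Population),
  error = function(e) conditionMessage(e)
)
err
```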
How to calculate weighted average with Rstudio
Here's the dplyr version of your algorithm:
library(dplyr)
inv %>%
group_by(Date) %>%
mutate(
weight = Quantity / sum(Quantity),
) %>%
summarize(
result = sum(Quantity * weight)
)
# # A tibble: 2 × 2
# Date result
# <chr> <dbl>
# 1 2020-01-01 80.5
# 2 2020-02-02 54.2
Or we can use the built-in weighted.mean function directly for the same result:
inv %>%
group_by(Date) %>%
summarize(
result = weighted.mean(Quantity, w = Quantity / sum(Quantity))
)
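As a side note, weighted.mean() divides by the sum of the weights itself, so normalising them first is optional. A minimal base R sketch with a made-up quantity vector q (not your inv data) shows the three forms giving the same number:

```r
q <- c(10, 30, 60)       # hypothetical quantities
w <- q / sum(q)          # normalised weights: 0.1, 0.3, 0.6

manual   <- sum(q * w)               # 10*0.1 + 30*0.3 + 60*0.6 = 46
with_w   <- weighted.mean(q, w)      # normalised weights
with_raw <- weighted.mean(q, q)      # raw weights, normalised internally

c(manual, with_w, with_raw)
```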
If I'm misunderstanding the goal, please edit your question to show the desired output for the sample input.
Calculate the weighted mean for a variable, which has two types with R
You can try this using dcast from data.table. You can change fun.aggregate to the function that you need.
library(data.table)
dcast(data,
id_municipio ~ v0220,
fun.aggregate = mean,
value.var = "peso_amostral")
OUTPUT:
id_municipio 1 2
1 1100015 3.98 3.37
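Note that fun.aggregate receives only the value column, so it cannot see a second column of weights. If what you need is a mean of some variable weighted by peso_amostral, one possible approach (a sketch, not necessarily what was asked for) is to compute the weighted mean per group first and then reshape. The column renda and all the numbers below are made up for illustration.

```r
library(data.table)

# Hypothetical data: renda is an invented variable to be averaged
dt <- data.table(
  id_municipio  = c(1100015, 1100015, 1100015, 1100015),
  v0220         = c(1, 1, 2, 2),
  renda         = c(100, 200, 300, 500),
  peso_amostral = c(1, 3, 2, 2)
)

# Weighted mean per municipality and type
wm <- dt[, .(wmean = weighted.mean(renda, peso_amostral)),
         by = .(id_municipio, v0220)]

# One column per value of v0220
wide <- dcast(wm, id_municipio ~ v0220, value.var = "wmean")
wide
# type 1: (100*1 + 200*3) / 4 = 175
# type 2: (300*2 + 500*2) / 4 = 400
```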