Weighted Mean by Row

Weighted mean by row

This seems to do the trick:

> apply(z, 1, function(x) weighted.mean(x[1:3], x[4:6]))
[1] 14.7 7.3 16.7 17.7 18.7 19.7 20.7 21.7 22.7 24.3

This will probably be a bit faster, though less clear as to what's going on:

> rowSums(z[,1:3] * z[,4:6]) / rowSums(z[,4:6])
[1] 14.7 7.3 16.7 17.7 18.7 19.7 20.7 21.7 22.7 24.3
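
Both snippets assume a matrix z whose first three columns hold the values and whose last three columns hold the matching weights. A minimal reproducible sketch with made-up data (so the numbers will not match the output above):

# hypothetical 10 x 6 matrix: columns 1-3 are values, columns 4-6 are weights
set.seed(1)
z <- cbind(matrix(rnorm(30, mean = 15), ncol = 3),
           matrix(runif(30), ncol = 3))
apply(z, 1, function(x) weighted.mean(x[1:3], x[4:6]))  # row-by-row
rowSums(z[, 1:3] * z[, 4:6]) / rowSums(z[, 4:6])        # vectorised, same result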

How to calculate weighted means for each column (or row) of a matrix using the columns (or rows) from another matrix?

You may simply do

colSums(m*w)/colSums(w)  ## columns
# [1] 0.2519816 0.4546775 0.7812545
rowSums(m*w)/rowSums(w) ## rows
# [1] 0.2147437 0.5273465 1.0559481

which should be fastest.

Or, if you want to stick with weighted.mean(), you can use mapply().

mapply(weighted.mean, as.data.frame(m), as.data.frame(w), USE.NAMES=F)  ## columns
# [1] 0.2519816 0.4546775 0.7812545
mapply(weighted.mean, as.data.frame(t(m)), as.data.frame(t(w)), USE.NAMES=F) ## rows
# [1] 0.2147437 0.5273465 1.0559481

Weighted Mean Row-wise with dynamically updated weights in Pandas

Let's try the following:

(i) Create a helper column: "ind".

(ii) Calculate the per-column terms of sum_i(wt_2_i) and store them in ww.

(iii) Calculate sum_i(wt_2_i) for each row: sm.

(iv) Using the helper column "ind", fill in the "weighted_mean" column: where ind == 1 take sm - 1; otherwise sum the current row's "var1" and "var2" values multiplied by the previous row's ww, and divide by sm.

df['ind'] = [1, 0, 1, 0, 1]           # helper column
cols = ['var1', 'var2']
ww = (df[cols] + 1) * 0.5             # per-column weight terms (use the initial weights here)
sm = ww.sum(axis=1)                   # row-wise sum of the weight terms
# where ind == 1 take sm - 1; otherwise apply the previous row's weights to the current values
df['weighted_mean'] = (sm - 1).where(df['ind'] == 1, (df[cols] * ww.shift()).sum(axis=1) / sm)
df = df.drop(columns='ind')           # drop the helper column

Output:

            var1  var2  weighted_mean
datetime
2015-01-02  0.07  0.02       0.045000
2015-01-03  0.08  0.01       0.045837
2015-01-04  0.04  0.02       0.030000
2015-01-05  0.01  0.02       0.015172
2015-01-06  0.03  0.08       0.055000

Using weighted.mean() in mutate() to create rowwise weighted means

Try it this way:

library(tibble)
library(dplyr)

df <- tibble(x = rnorm(10, 0, 5),
             y = rnorm(10, 0, 10))

df$z <- df %>%
  rowwise() %>%
  do(data.frame(
    z = weighted.mean(
      x = c(.$x, .$y),
      w = c(.2, .8)
    )
  )) %>%
  ungroup() %>%
  magrittr::use_series("z")

or

df %>%
  mutate(z = df %>%
           rowwise() %>%
           do(data.frame(
             z = weighted.mean(
               x = c(.$x, .$y),
               w = c(.2, .8)
             )
           )))


         x       y        z
     <dbl>   <dbl>    <dbl>
 1   0.176  -1.95   -1.52
 2  -3.33   -6.88   -6.17
 3  -4.08    0.827  -0.154
 4   0.609   1.68    1.47
 5   0.327   8.06    6.51
 6  -8.63   -2.12   -3.42
 7  -4.68   -8.52   -7.76
 8   6.49  -13.0    -9.07
 9  -2.95  -25.4   -20.9
10   2.78    5.36    4.85
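
Since the weights here are fixed scalars that sum to 1, the same z can also be computed with a plain vectorised mutate(), without rowwise()/do(); a minimal sketch using the same df:

library(dplyr)

df %>%
  mutate(z = 0.2 * x + 0.8 * y)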

Calculate weighted average of dataframe rows with missing values

Implementing the idea from my comment above, which is simpler than I thought because DataFrame.sum skips NaN values by default (effectively filling them with 0):

(df*w).sum(axis=1)/(~pd.isnull(df)*w).sum(axis=1)

will perform this operation in a vectorized way on all rows.

R - Weighted Mean by row for multiple columns based on the columns' string values

I guess you could get your weight vector like this:

library(tidyverse)

weights_precursor <- str_split(names(data)[-1], pattern = "\\.", n = 2, simplify = TRUE)[, 1] %>%
  as.numeric()

weights <- 2305.2 * weights_precursor ^ -1.019
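
From there, assuming the remaining columns of data (everything except the first) hold the values to be averaged, the row-wise weighted mean could be computed much like the rowSums() approach above; a sketch with no NA handling, column selection hypothetical:

vals <- as.matrix(data[-1])                                        # value columns, one per weight
data$weighted_mean <- as.numeric(vals %*% weights) / sum(weights)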

