Weighted Mean by Row

Weighted mean by row

This seems to do the trick:

> apply(z, 1, function(x) weighted.mean(x[1:3], x[4:6]))
[1] 14.7 7.3 16.7 17.7 18.7 19.7 20.7 21.7 22.7 24.3

This will probably be a bit faster, though less clear as to what's going on:

> rowSums(z[,1:3] * z[,4:6]) / rowSums(z[,4:6])
[1] 14.7 7.3 16.7 17.7 18.7 19.7 20.7 21.7 22.7 24.3
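
Both snippets assume a matrix z whose first three columns hold the values and whose last three columns hold the matching weights. A minimal reproducible sketch with made-up data (so the numbers will not match the output above):

# hypothetical 10 x 6 matrix: columns 1-3 are values, columns 4-6 are weights
set.seed(1)
z <- cbind(matrix(rnorm(30, mean = 15), ncol = 3),
           matrix(runif(30), ncol = 3))
apply(z, 1, function(x) weighted.mean(x[1:3], x[4:6]))  # row-by-row
rowSums(z[, 1:3] * z[, 4:6]) / rowSums(z[, 4:6])        # vectorised, same result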

How to calculate weighted means for each column (or row) of a matrix using the columns (or rows) from another matrix?

You may simply do

colSums(m*w)/colSums(w)  ## columns
# [1] 0.2519816 0.4546775 0.7812545
rowSums(m*w)/rowSums(w) ## rows
# [1] 0.2147437 0.5273465 1.0559481

which should be fastest.

Or, if you want to stick with weighted.mean(), you can use mapply().

mapply(weighted.mean, as.data.frame(m), as.data.frame(w), USE.NAMES=F)  ## columns
# [1] 0.2519816 0.4546775 0.7812545
mapply(weighted.mean, as.data.frame(t(m)), as.data.frame(t(w)), USE.NAMES=F) ## rows
# [1] 0.2147437 0.5273465 1.0559481

Weighted Mean Row-wise with dynamically updated weights in Pandas

Let's try the following:

(i) Create a helper column: "ind".

(ii) Calculate the per-column terms of sum_i(wt_2_i) and store them in ww.

(iii) Calculate sum_i(wt_2_i) for each row: sm.

(iv) Using the helper column "ind", fill in the "weighted_mean" column: where ind == 1 take sm - 1; otherwise sum the current row's "var1" and "var2" values multiplied by the previous row's ww, and divide by sm.

df['ind'] = [1, 0, 1, 0, 1]           # helper column
cols = ['var1', 'var2']
ww = (df[cols] + 1) * 0.5             # per-column weight terms (use the initial weights here)
sm = ww.sum(axis=1)                   # row-wise sum of the weight terms
# where ind == 1 take sm - 1; otherwise apply the previous row's weights to the current values
df['weighted_mean'] = (sm - 1).where(df['ind'] == 1, (df[cols] * ww.shift()).sum(axis=1) / sm)
df = df.drop(columns='ind')           # drop the helper column

Output:

            var1  var2  weighted_mean
datetime
2015-01-02  0.07  0.02       0.045000
2015-01-03  0.08  0.01       0.045837
2015-01-04  0.04  0.02       0.030000
2015-01-05  0.01  0.02       0.015172
2015-01-06  0.03  0.08       0.055000

Using weighted.mean() in mutate() to create rowwise weighted means

Try it this way:

library(tibble)
library(dplyr)

df <- tibble(x = rnorm(10, 0, 5),
             y = rnorm(10, 0, 10))

df$z <- df %>%
  rowwise() %>%
  do(data.frame(
    z = weighted.mean(
      x = c(.$x, .$y),
      w = c(.2, .8)
    )
  )) %>%
  ungroup() %>%
  magrittr::use_series("z")

or

df %>%
  mutate(z = df %>%
           rowwise() %>%
           do(data.frame(
             z = weighted.mean(
               x = c(.$x, .$y),
               w = c(.2, .8)
             )
           )))


         x       y        z
     <dbl>   <dbl>    <dbl>
 1   0.176  -1.95   -1.52
 2  -3.33   -6.88   -6.17
 3  -4.08    0.827  -0.154
 4   0.609   1.68    1.47
 5   0.327   8.06    6.51
 6  -8.63   -2.12   -3.42
 7  -4.68   -8.52   -7.76
 8   6.49  -13.0    -9.07
 9  -2.95  -25.4   -20.9
10   2.78    5.36    4.85
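
Since the weights here are fixed scalars that sum to 1, the same z can also be computed with a plain vectorised mutate(), without rowwise()/do(); a minimal sketch using the same df:

library(dplyr)

df %>%
  mutate(z = 0.2 * x + 0.8 * y)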

Calculate weighted average of dataframe rows with missing values

Implementing the idea from my comment above, which is simpler than I thought because DataFrame.sum skips NaN values by default (effectively filling them with 0):

(df*w).sum(axis=1)/(~pd.isnull(df)*w).sum(axis=1)

will perform this operation in a vectorized way on all rows.

R - Weighted Mean by row for multiple columns based on the columns' string values

I guess you could get your weight vector like this:

library(tidyverse)

weights_precursor <- str_split(names(data)[-1], pattern = "\\.", n = 2, simplify = TRUE)[, 1] %>%
  as.numeric()

weights <- 2305.2 * weights_precursor ^ -1.019
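
From there, assuming the remaining columns of data (everything except the first) hold the values to be averaged, the row-wise weighted mean could be computed much like the rowSums() approach above; a sketch with no NA handling, column selection hypothetical:

vals <- as.matrix(data[-1])                                        # value columns, one per weight
data$weighted_mean <- as.numeric(vals %*% weights) / sum(weights)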

