Weighted mean by row
This seems to do the trick:
> apply(z, 1, function(x) weighted.mean(x[1:3], x[4:6]))
[1] 14.7 7.3 16.7 17.7 18.7 19.7 20.7 21.7 22.7 24.3
This will probably be a bit faster, though less clear as to what's going on:
> rowSums(z[,1:3] * z[,4:6]) / rowSums(z[,4:6])
[1] 14.7 7.3 16.7 17.7 18.7 19.7 20.7 21.7 22.7 24.3
How to calculate weighted means for each column (or row) of a matrix using the columns (or rows) from another matrix?
You may simply do
colSums(m*w)/colSums(w) ## columns
# [1] 0.2519816 0.4546775 0.7812545
rowSums(m*w)/rowSums(w) ## rows
# [1] 0.2147437 0.5273465 1.0559481
which should be fastest.
Or, if you stick to weighted.mean(), you may use mapply.
mapply(weighted.mean, as.data.frame(m), as.data.frame(w), USE.NAMES=FALSE) ## columns
# [1] 0.2519816 0.4546775 0.7812545
mapply(weighted.mean, as.data.frame(t(m)), as.data.frame(t(w)), USE.NAMES=FALSE) ## rows
# [1] 0.2147437 0.5273465 1.0559481
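As a short sketch of why the colSums/rowSums form agrees with weighted.mean(): assuming m and w have the same dimensions and contain no missing values, m*w is the elementwise product, so each column sum of m*w is the numerator of that column's weighted mean and each column sum of w is its denominator. The index names i (rows) and j (columns) are notation introduced here for illustration.

```latex
\bar{m}_{j} \;=\; \frac{\sum_{i} w_{ij}\, m_{ij}}{\sum_{i} w_{ij}}
\qquad \text{(columns: colSums(m*w) / colSums(w))}

\bar{m}_{i} \;=\; \frac{\sum_{j} w_{ij}\, m_{ij}}{\sum_{j} w_{ij}}
\qquad \text{(rows: rowSums(m*w) / rowSums(w))}
```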
Weighted Mean Row wise with dynamically updated weights in Pandas
Let's try the following:
(i) Create a helper column: "ind".
(ii) Calculate the LHS and RHS terms of sum_i(wt_2_i): ww
(iii) Calculate sum_i(wt_2_i) of each row: sm
(iv) Using the helper column "ind", fill in the "weighted_mean" column using the product of the previous row's ww and the "var1" and "var2" values for each row.
df['ind'] = [1,0,1,0,1]
cols = ['var1','var2']
ww = (df[cols] + 1) * 0.5 # use initial weights here
sm = ww.sum(axis=1)
df['weighted_mean'] = (sm - 1).where(df['ind']==1, (df[cols] * ww.shift()).sum(axis=1) / sm)
df = df.drop(columns='ind')
Output:
            var1  var2  weighted_mean
datetime
2015-01-02  0.07  0.02       0.045000
2015-01-03  0.08  0.01       0.045837
2015-01-04  0.04  0.02       0.030000
2015-01-05  0.01  0.02       0.015172
2015-01-06  0.03  0.08       0.055000
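To try this end to end, here is a minimal self-contained sketch. It assumes the input frame holds the var1 and var2 values shown in the output, indexed by date; the original question's data-loading code is not shown, so the frame construction is an assumption. It also keeps the 1/0 flags in a separate Series named ind instead of a temporary column, which is only a stylistic variation.

```python
import pandas as pd

# Assumed input: values copied from the output table above.
dates = pd.to_datetime(
    ["2015-01-02", "2015-01-03", "2015-01-04", "2015-01-05", "2015-01-06"]
)
df = pd.DataFrame(
    {"var1": [0.07, 0.08, 0.04, 0.01, 0.03],
     "var2": [0.02, 0.01, 0.02, 0.02, 0.08]},
    index=dates,
)
df.index.name = "datetime"

# Helper flags: 1 marks rows that use the initial weights (0.5 each).
ind = pd.Series([1, 0, 1, 0, 1], index=df.index)

cols = ["var1", "var2"]
ww = (df[cols] + 1) * 0.5   # per-column terms, starting from the initial weights
sm = ww.sum(axis=1)         # row-wise sum of those terms

# Rows with ind == 0 combine the current values with the previous row's ww.
carried = (df[cols] * ww.shift()).sum(axis=1) / sm
df["weighted_mean"] = (sm - 1).where(ind == 1, carried)

print(df)
```

The first row has no predecessor, so ww.shift() is NaN there; that does not matter here because the first row is flagged with ind == 1 and takes the sm - 1 branch.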
Using weighted.mean() in mutate() to create rowwise weighted means
Try it this way:
library(dplyr)  # provides tibble(), rowwise(), do(), ungroup() and %>%
df <- tibble(x = rnorm(10, 0, 5),
             y = rnorm(10, 0, 10))
df$z <- df %>%
  rowwise %>%
  do(data.frame(
    z = weighted.mean(
      x = c(.$x, .$y),
      w = c(.2, .8)
    )
  )) %>%
  ungroup %>%
  magrittr::use_series("z")
or
df %>%
  mutate(z = df %>%
           rowwise %>%
           do(data.frame(
             z = weighted.mean(
               x = c(.$x, .$y),
               w = c(.2, .8)
             )
           )) %>%
           pull(z))  # extract the numeric vector so z is a plain column, not a nested data frame
        x       y        z
    <dbl>   <dbl>    <dbl>
 1  0.176  -1.95    -1.52
 2 -3.33   -6.88    -6.17
 3 -4.08    0.827   -0.154
 4  0.609   1.68     1.47
 5  0.327   8.06     6.51
 6 -8.63   -2.12    -3.42
 7 -4.68   -8.52    -7.76
 8  6.49  -13.0     -9.07
 9 -2.95  -25.4    -20.9
10  2.78    5.36     4.85
Calculate weighted average of dataframe rows with missing values
Implementing the idea in my comment above, which is simpler than I thought because the DataFrame.sum method skips NaN values by default (skipna=True), which has the same effect as filling them with 0 before summing:
(df*w).sum(axis=1)/(~pd.isnull(df)*w).sum(axis=1)
will perform this operation in a vectorized way on all rows.
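A small sketch of the expression on made-up data, to show how the denominator adapts per row. The frame df, the column names a, b, c and the weights w are hypothetical values chosen for illustration; w is a Series indexed by column label so that it aligns with the columns of df.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three value columns with some missing entries.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [2.0, 4.0, np.nan],
    "c": [3.0, 6.0, 9.0],
})

# Hypothetical weights, one per column, aligned by column label.
w = pd.Series({"a": 1.0, "b": 2.0, "c": 3.0})

# Numerator: NaN products are skipped by sum() (skipna=True by default).
num = (df * w).sum(axis=1)

# Denominator: only the weights of the non-missing cells in each row.
den = (~pd.isnull(df) * w).sum(axis=1)

result = num / den
print(result)
```

Each row is normalised by the weights of the values that are actually present, so a missing cell drops out of both the numerator and the denominator. A row with no values at all would give 0/0, which pandas reports as NaN.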
R - Weighted Mean by row for multiple columns based on columns string values
I guess you could get your weight vector like this:
library(tidyverse)
weights_precursor <- str_split(names(data)[-1], pattern = "\\.", n = 2, simplify = TRUE)[, 1] %>%
  as.numeric()
weights <- 2305.2 * weights_precursor ^ -1.019