Rolling computation of two simultaneous variables iteratively or rowwise, using three other given variables
Here we need to generate two output columns simultaneously and iteratively from three input columns. purrr::accumulate() normally builds one output from one input, and purrr::accumulate2() takes two inputs but still returns one output. So my strategy with accumulate() is as follows:
- Rearrange the three input columns into one row-wise tibble per row, so that the three inputs become a single list column. For this I generated a dummy column id, so that each row is converted into its own tibble. I used dplyr::nest_by() for this.
- For the output, I had accumulate() return a tibble (one Bid and one Ask per step) instead of a vector.
- Lastly, I converted both list columns back to ordinary columns using tidyr::unnest_wider().
crass[1:3] %>%
nest_by(id = row_number()) %>%
ungroup() %>%
mutate(new = accumulate(data,
.init = list(Bid = 3801, Ask = 3802),
~ tibble(Bid = ifelse(.y$WT_TRADE_PRICE >= min(.x$Ask, (.x$Ask + .x$Bid)/2),
.x$Bid,
.y$min),
Ask = ifelse(.y$WT_TRADE_PRICE >= min(.x$Ask, (.x$Ask + .x$Bid)/2),
.y$max,
.x$Ask))
)[-1]) %>%
unnest_wider(data) %>%
unnest_wider(new)
# A tibble: 20 x 6
id WT_TRADE_PRICE min max Bid Ask
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 3801 3801 3801 3801 3802
2 2 3801 3801 3801 3801 3802
3 3 3801 3801 3801 3801 3802
4 4 3797 3797 3797 3797 3802
5 5 3797 3797 3797 3797 3802
6 6 3796. 3795 3800 3795 3802
7 7 3800 3800 3800 3795 3800
8 8 3797 3797 3797 3797 3800
9 9 3795. 3794 3797 3794 3800
10 10 3794 3794 3794 3794 3800
11 11 3793 3793 3793 3793 3800
12 12 3793 3793 3793 3793 3800
13 13 3794. 3793 3794 3793 3800
14 14 3795. 3794 3797 3794 3800
15 15 3793. 3790 3794 3790 3800
16 16 3789 3789 3789 3789 3800
17 17 3790 3790 3790 3790 3800
18 18 3788 3788 3788 3788 3800
19 19 3788 3788 3788 3788 3800
20 20 3788 3788 3788 3788 3800
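The same idea, carrying a two-value state from one row to the next, can be sketched in base R with Reduce(). This is only an illustration under assumptions: the three-row toy data frame and the helper name step_bid_ask are hypothetical, and Reduce() stands in here for purrr::accumulate().

```r
# Toy inputs (hypothetical values, not the full `crass` data)
toy <- data.frame(
  WT_TRADE_PRICE = c(3801, 3797, 3800),
  min            = c(3801, 3797, 3800),
  max            = c(3801, 3797, 3800)
)

# One update step (hypothetical helper): `state` is the previous Bid/Ask pair,
# `row` is the current one-row data frame of inputs
step_bid_ask <- function(state, row) {
  threshold <- min(state$Ask, (state$Ask + state$Bid) / 2)
  if (row$WT_TRADE_PRICE >= threshold) {
    list(Bid = state$Bid, Ask = row$max)   # price at or above threshold: update Ask
  } else {
    list(Bid = row$min, Ask = state$Ask)   # otherwise: update Bid
  }
}

# Split into one-row data frames, fold over them, and drop the initial state
rows   <- split(toy, seq_len(nrow(toy)))
states <- Reduce(step_bid_ask, rows, accumulate = TRUE,
                 init = list(Bid = 3801, Ask = 3802))[-1]

toy$Bid <- vapply(states, function(s) s$Bid, numeric(1))
toy$Ask <- vapply(states, function(s) s$Ask, numeric(1))
toy
```

Because each step returns a list with both values, the next step can read the previous Bid and Ask together, which is the point of returning a tibble from accumulate() above.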
Earlier revised for loop syntax
for(i in 2:nrow(crass)){
if(crass[i, 1] >= min(crass[i-1, 5], (crass[i-1, 4] + crass[i-1, 5])/2)){
crass[i, 5] <- crass[i, 3]
crass[i, 4] <- crass[i-1, 4]
} else {
crass[i, 4] <- crass[i, 2]
crass[i, 5] <- crass[i-1, 5]
}
}
crass
# A tibble: 20 x 5
WT_TRADE_PRICE min max Bid Ask
<dbl> <dbl> <dbl> <dbl> <dbl>
1 3801 3801 3801 3801 3802
2 3801 3801 3801 3801 3802
3 3801 3801 3801 3801 3802
4 3797 3797 3797 3797 3802
5 3797 3797 3797 3797 3802
6 3796. 3795 3800 3795 3802
7 3800 3800 3800 3795 3800
8 3797 3797 3797 3797 3800
9 3795. 3794 3797 3794 3800
10 3794 3794 3794 3794 3800
11 3793 3793 3793 3793 3800
12 3793 3793 3793 3793 3800
13 3794. 3793 3794 3793 3800
14 3795. 3794 3797 3794 3800
15 3793. 3790 3794 3790 3800
16 3789 3789 3789 3789 3800
17 3790 3790 3790 3790 3800
18 3788 3788 3788 3788 3800
19 3788 3788 3788 3788 3800
20 3788 3788 3788 3788 3800
crass before running the for loop
# A tibble: 20 x 5
WT_TRADE_PRICE min max Bid Ask
<dbl> <dbl> <dbl> <dbl> <dbl>
1 3801 3801 3801 3801 3802
2 3801 3801 3801 NA NA
3 3801 3801 3801 NA NA
4 3797 3797 3797 NA NA
5 3797 3797 3797 NA NA
6 3796. 3795 3800 NA NA
7 3800 3800 3800 NA NA
8 3797 3797 3797 NA NA
9 3795. 3794 3797 NA NA
10 3794 3794 3794 NA NA
11 3793 3793 3793 NA NA
12 3793 3793 3793 NA NA
13 3794. 3793 3794 NA NA
14 3795. 3794 3797 NA NA
15 3793. 3790 3794 NA NA
16 3789 3789 3789 NA NA
17 3790 3790 3790 NA NA
18 3788 3788 3788 NA NA
19 3788 3788 3788 NA NA
20 3788 3788 3788 NA NA
How to use accumulate function to calculate 2 intertwining variables
Here we need to generate two output columns simultaneously and iteratively from one input column and the given initial values of the two output columns. purrr::accumulate() normally builds one output from one input, and purrr::accumulate2() takes two inputs but still returns one output. So my strategy with accumulate() is elaborated below.
For further reference, you may also see this answer, where multiple results have been generated in purrr::accumulate() using even more than two arguments (input values) simultaneously.
- I presume your first row is there only to supply the initial values for the remaining rows of the columns.
- To get multiple outputs from accumulate(), you can make purrr::accumulate() (an awesome function, I must say) return a tibble instead of a vector.
- I removed the first row.
- I used your initial values in the .init argument.
- I created the output in tibble format.
- I used the syntax tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E, E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50)
- If you look closely, C is generated with the given formula.
- For E, however, we need the newly generated (output) value of C, so I wrote out the formula of C there instead of using .x or .y.
- Finally, I unnested the output using unnest_wider() from tidyr.
library(tidyverse, warn.conflicts = F)
df <- structure(list(Scenario = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1), SlNo = c(NA,
1L, 5L, 9L, 13L, 17L, 21L, 25L, 29L, 33L), A = c(NA, 14, 1, 17,
8, 10, 9, 14, 14, 4), B = c(NA, 19, 13, 20, 7, 16, 12, 18, 15,
17), C = c(4, 0, 0, 0, 0, 0, 0, 0, 0, 0), E = c(6, 0, 0, 0, 0,
0, 0, 0, 0, 0)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))
df[1:4] %>%
filter(Scenario != 0) %>%
mutate(new = accumulate(B,
.init = tibble(C = 4, E = 6),
~ tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E,
E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50
)
)[-1]
) %>%
unnest_wider(new)
#> # A tibble: 9 x 6
#> Scenario SlNo A B C E
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 14 19 32 1
#> 2 1 5 1 13 35 -2
#> 3 1 9 17 20 40.5 10.5
#> 4 1 13 8 7 42.8 -0.25
#> 5 1 17 10 16 42.1 8.12
#> 6 1 21 9 12 46.2 8.19
#> 7 1 25 14 18 54.3 22.3
#> 8 1 29 14 15 69.4 34.4
#> 9 1 33 4 17 91.1 58.1
Created on 2021-07-05 by the reprex package (v2.0.0)
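As a cross-check, the recurrence can also be written as a plain loop in base R. This is a minimal sketch, not the answer's method: it uses only the first three B values of df after the Scenario 0 row, and the relation E = new C + B - 50 is inferred from the expanded E expression in the code above rather than stated in the answer.

```r
# B values from rows 2-4 of `df` above; initial values C = 4 and E = 6
B <- c(19, 13, 20)
n <- length(B)
C <- numeric(n)
E <- numeric(n)

C_prev <- 4
E_prev <- 6
for (i in seq_len(n)) {
  C[i] <- (1 - 0.5) * C_prev + 5 + B[i] + E_prev  # same C formula as above
  E[i] <- C[i] + B[i] - 50                        # E expressed via the new C
  C_prev <- C[i]
  E_prev <- E[i]
}

data.frame(B, C, E)
```

The first three rows agree with the reprex output above (C = 32, 35, 40.5 and E = 1, -2, 10.5).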
Cumulatively add number between rows in order to create new columns BUT with different way for the first element occurrence in a column in R
Since you can't compute all columns at once (you need to wait for previous iterations to be able to compute the result for line i), just use a loop. The row number per Sp1 can be computed in one step using dplyr:
df <- df %>% group_by(Sp1) %>% mutate(sp_row = row_number()) %>% ungroup()
df$new_start <- df$new_end <- NA
df$new_start[1] <- df$start[1]
df$new_end[1] <- df$end[1]
for( i in 2:nrow(df)) {
if(df$sp_row[i]==1) {
df$new_start[i] <- df$new_end[i-1]+1
df$new_end[i] <- df$new_start[i]+df$end[i]-df$start[i]
}
if(df$sp_row[i]!=1) {
df$new_start[i] <- df$start[i]-df$new_end[i-1]
df$new_end[i] <- df$new_start[i]+df$end[i]-df$start[i]
}
}
# A tibble: 5 x 6
Sp1 start end new_start new_end sp_row
<chr> <int> <dbl> <dbl> <dbl> <int>
1 A 100 1077 100 1077 1
2 B 2316 4088 1078 2850 1
3 B 26647 28746 23797 25896 2
4 B 50000 60000 24104 34104 3
5 C 450 789 34105 34444 1
There is at least one mistake in your example, by the way: 50000-25896 = 29053 is wrong (it should be 24104).
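For a version that runs on its own, here is a sketch that rebuilds the input from the Sp1, start and end columns of the output table above. It assumes base R only, with ave() as a stand-in for group_by() plus row_number(), and it folds the two if blocks into one if/else.

```r
# Input columns copied from the output table above
df <- data.frame(
  Sp1   = c("A", "B", "B", "B", "C"),
  start = c(100, 2316, 26647, 50000, 450),
  end   = c(1077, 4088, 28746, 60000, 789)
)

# Row number within each Sp1 group
df$sp_row <- ave(seq_along(df$Sp1), df$Sp1, FUN = seq_along)

df$new_start <- NA_real_
df$new_end   <- NA_real_
df$new_start[1] <- df$start[1]
df$new_end[1]   <- df$end[1]

for (i in 2:nrow(df)) {
  if (df$sp_row[i] == 1) {
    # first row of a new Sp1: continue right after the previous end
    df$new_start[i] <- df$new_end[i - 1] + 1
  } else {
    # later rows of the same Sp1: offset by the previous new_end
    df$new_start[i] <- df$start[i] - df$new_end[i - 1]
  }
  df$new_end[i] <- df$new_start[i] + df$end[i] - df$start[i]
}
df
```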
Using pmap with c(...) part 2
The issue seems to be mixing a custom anonymous function (function(Weight, Days, ...), where the arguments are named the same as the columns) with the default lambda function (~, where the arguments are .x and .y if there are only two elements, or ..1, ..2, ..3, etc. if there are more than two). In the OP's code
library(dplyr)
library(purrr)
df %>%
mutate(pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days))))
Here, 'Weight' and 'Days' return the full column values from the original dataset, not the values from each row. If we still want to make use of the above command, we need to convert the data captured in each row to a tibble and use with
df %>%
pmap_dfr(., ~ with(as_tibble(list(...)),
setNames(rep(Weight, Days), seq_len(Days))))
# A tibble: 3 x 7
# `1` `2` `3` `4` `5` `6` `7`
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 0.03 0.03 0.03 0.03 0.03 NA NA
#2 0.02 0.02 0.02 0.02 0.02 0.02 0.02
#3 0.04 0.04 0.04 NA NA NA NA
If we want the other columns,
df %>%
pmap_dfr(., ~ c(list(...)[-(3:4)], with(as_tibble(list(...)),
setNames(rep(Weight, Days), seq_len(Days)))))
# A tibble: 3 x 9
# Name School `1` `2` `3` `4` `5` `6` `7`
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach 0.03 0.03 0.03 0.03 0.03 NA NA
#2 Antoine Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
#3 Barbara Franklin 0.04 0.04 0.04 NA NA NA NA
Or use rowwise
library(tidyr)
df %>%
rowwise %>%
mutate(out = list(setNames(rep(Weight, Days), seq_len(Days)))) %>%
ungroup %>%
unnest_wider(c(out)) %>%
select(-Weight, -Days)
# A tibble: 3 x 9
# Name School `1` `2` `3` `4` `5` `6` `7`
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach 0.03 0.03 0.03 0.03 0.03 NA NA
#2 Antoine Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
#3 Barbara Franklin 0.04 0.04 0.04 NA NA NA NA
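For comparison, the same per-row expansion can be sketched in base R with mapply(), which passes the row values of Weight and Days to the function one pair at a time. This is an alternative under assumptions, not the answer's approach: the data frame is reconstructed from the printed outputs above, and the order of the Weight and Days columns is a guess.

```r
# Data reconstructed from the outputs above (column order is an assumption)
df <- data.frame(
  Name   = c("Antoine", "Antoine", "Barbara"),
  School = c("Bach", "Ken", "Franklin"),
  Weight = c(0.03, 0.02, 0.04),
  Days   = c(5, 7, 4)
)

max_days <- max(df$Days)

# For each row, repeat Weight `Days` times and pad with NA up to max_days
# (indexing past the end of a vector yields NA)
padded <- mapply(function(w, d) rep(w, d)[seq_len(max_days)],
                 df$Weight, df$Days)

# mapply() returns one column per row of df, so transpose to one row per row
wide <- t(padded)
colnames(wide) <- as.character(seq_len(max_days))

out <- cbind(df[c("Name", "School")], wide)
out
```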
Fast way to calculate values in cells based on values in previous rows
My friend's output/strategy is fabulous, but since we cannot pass two input vectors to base R's Reduce(), I used this trick:
- Generated fresh values of var1 in a data.frame() inside Reduce().
- Where you want to use current values of var1, use .y.
- Where previous values are to be used, use .x$var1 instead.
- Wrote out the formula wherever I needed the newly generated value of a variable.
- The rest is pretty clear, I think. accumulate = TRUE is obvious because you want all intermediate values.
- Since the output here is a list, it is row-bound with rbind via do.call.
In base R you can do
do.call(rbind, Reduce(function(.x, .y) {data.frame(var1 = .y,
var2 = .x$var2 + .x$var3 -.y/constant,
var3 = .x$var1 + 0.1 * (.x$var2 + .x$var3 -.y/constant)/constant)},
dt$var1[-1],
init = data.frame(var1 = dt$var1[1], var2 = -3.12, var3 = 1),
accumulate = TRUE))
var1 var2 var3
1 -92186.747 -3.120000e+00 1.00
2 -19163.504 -2.088501e+00 -92186.75
3 -18178.840 -9.218881e+04 -19163.52
4 -9844.679 -1.113523e+05 -18178.86
5 -16494.780 -1.295311e+05 -9844.70
6 -17088.058 -1.393758e+05 -16494.80
which can be emulated in tidyverse/purrr as follows
library(purrr)
accumulate(dt$var1[-1], .init = data.frame(var1 = dt$var1[1], var2 = -3.12, var3 = 1),
~ data.frame(var1 = .y,
var2 = .x$var2 + .x$var3 -(.y/constant),
var3 = .x$var1 + 0.1 * (.x$var2 + .x$var3 -(.y/constant))/constant)) %>% map_df(~.x)
var1 var2 var3
1 -92186.747 -3.120000e+00 1.00
2 -19163.504 -2.088501e+00 -92186.75
3 -18178.840 -9.218881e+04 -19163.52
4 -9844.679 -1.113523e+05 -18178.86
5 -16494.780 -1.295311e+05 -9844.70
6 -17088.058 -1.393758e+05 -16494.80
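Since dt and constant are not shown in the answer, the following sketch uses hypothetical toy values for both. It compares the Reduce() recurrence above with a preallocated for loop. This is only a consistency check under those assumptions, not a benchmark.

```r
# Hypothetical toy inputs; `dt` and `constant` are not given in the answer
dt <- data.frame(var1 = c(10, 20, 30, 40))
constant <- 10

# Preallocated loop version of the recurrence
n <- nrow(dt)
var2 <- numeric(n)
var3 <- numeric(n)
var2[1] <- -3.12
var3[1] <- 1
for (i in 2:n) {
  var2[i] <- var2[i - 1] + var3[i - 1] - dt$var1[i] / constant
  var3[i] <- dt$var1[i - 1] + 0.1 * var2[i] / constant
}
loop_out <- data.frame(var1 = dt$var1, var2 = var2, var3 = var3)

# Reduce() version, with the state carried as a one-row data.frame
reduce_out <- do.call(rbind, Reduce(
  function(.x, .y) {
    new_var2 <- .x$var2 + .x$var3 - .y / constant
    data.frame(var1 = .y,
               var2 = new_var2,
               var3 = .x$var1 + 0.1 * new_var2 / constant)
  },
  dt$var1[-1],
  init = data.frame(var1 = dt$var1[1], var2 = -3.12, var3 = 1),
  accumulate = TRUE
))

all.equal(loop_out, reduce_out, check.attributes = FALSE)
```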
R: pass multiple arguments to accumulate/reduce
If we want to use accumulate2, then specify the arguments correctly, i.e. it takes two input arguments, 'pw' and 'add', and an initialization argument, which would be the first value of 'x'. As the data is grouped by 'ID', do the grouping before calling accumulate2, refer to the default lambda arguments ..1, ..2 and ..3 in that order, and build the recursive function from them
library(dplyr)
library(purrr)
out <- df %>%
group_by(ID) %>%
mutate(x1 = accumulate2(pw[-1], add[-1], ~ ..1^..2 + ..3,
.init = first(x)) %>%
flatten_dbl ) %>%
ungroup
out$x1
#[1] 1 2 11
#[4] 4 1031 1201024845477409792
With more than 3 arguments, a for loop would be better
# // initialize an empty vector
out <- c()
# // loop over the `unique` ID
for(id in unique(df$ID)) {
# // create a temporary subset of data based on that id
tmp_df <- subset(df, ID == id)
# // initialize a temporary storage output
tmp_out <- numeric(nrow(tmp_df))
# // initialize first value with the first element of x
tmp_out[1] <- tmp_df$x[1]
# // if the number of rows is greater than 1
if(nrow(tmp_df) > 1) {
# // loop over the rows
for(i in 2:nrow(tmp_df)) {
#// do the recursive calculation and update
tmp_out[i] <- tmp_out[i - 1]^ tmp_df$pw[i] + tmp_df$add[i]
}
}
out <- c(out, tmp_out)
}
out
#[1] 1 2 11
#[4] 4 1031 1201024845477409792
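Another option when more inputs are involved is to fold over row positions instead of over the values, so the function can index into as many vectors as it needs. This is a minimal base R sketch for a single ID, not the answer's method. The x, pw and add values are hypothetical.

```r
# Hypothetical inputs for one ID; only x[1] and entries 2..n of pw/add are used
x   <- c(1, NA, NA)
pw  <- c(NA, 2, 3)
add <- c(NA, 1, 3)

# Fold over positions 2..n; `prev` is the previous result, `i` the current row
x1 <- unlist(Reduce(function(prev, i) prev^pw[i] + add[i],
                    seq_along(pw)[-1],
                    init = x[1],
                    accumulate = TRUE))
x1
```

With grouped data, the same call could be applied per ID, for example inside the loop over unique(df$ID) shown above.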