Rolling Computation of Two Simultaneous Variables Iteratively or Rowwise, Using Three Other Given Variables

Here we need to generate two output columns simultaneously and iteratively from three input columns. purrr::accumulate() normally produces one output from one input, and purrr::accumulate2() takes two inputs but still produces one output. My strategy with accumulate() is therefore as follows:

  • Rearrange the three input columns into one tibble per row, so that the three inputs become a single list-column. For this I generated a dummy id column, so that each row is converted to its own tibble.
  • I used dplyr::nest_by() for this.
  • For the output I again made accumulate() return a tibble instead of a vector.
  • Lastly, I converted both list-columns back to their original shapes with tidyr::unnest_wider().
crass[1:3] %>%
  nest_by(id = row_number()) %>%
  ungroup() %>%
  mutate(new = accumulate(data,
                          .init = list(Bid = 3801, Ask = 3802),
                          ~ tibble(Bid = ifelse(.y$WT_TRADE_PRICE >= min(.x$Ask, (.x$Ask + .x$Bid)/2),
                                                .x$Bid,
                                                .y$min),
                                   Ask = ifelse(.y$WT_TRADE_PRICE >= min(.x$Ask, (.x$Ask + .x$Bid)/2),
                                                .y$max,
                                                .x$Ask))
                          )[-1]) %>%
  unnest_wider(data) %>%
  unnest_wider(new)

# A tibble: 20 x 6
      id WT_TRADE_PRICE   min   max   Bid   Ask
   <int>          <dbl> <dbl> <dbl> <dbl> <dbl>
 1     1          3801   3801  3801  3801  3802
 2     2          3801   3801  3801  3801  3802
 3     3          3801   3801  3801  3801  3802
 4     4          3797   3797  3797  3797  3802
 5     5          3797   3797  3797  3797  3802
 6     6          3796.  3795  3800  3795  3802
 7     7          3800   3800  3800  3795  3800
 8     8          3797   3797  3797  3797  3800
 9     9          3795.  3794  3797  3794  3800
10    10          3794   3794  3794  3794  3800
11    11          3793   3793  3793  3793  3800
12    12          3793   3793  3793  3793  3800
13    13          3794.  3793  3794  3793  3800
14    14          3795.  3794  3797  3794  3800
15    15          3793.  3790  3794  3790  3800
16    16          3789   3789  3789  3789  3800
17    17          3790   3790  3790  3790  3800
18    18          3788   3788  3788  3788  3800
19    19          3788   3788  3788  3788  3800
20    20          3788   3788  3788  3788  3800
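
The same state-carrying idea can also be tried without the nesting step. What follows is a minimal base R sketch, assuming Reduce() with accumulate = TRUE as a stand-in for purrr::accumulate() and iterating over row indices. The data frame `toy`, its values, and the seed Bid/Ask are made up for illustration; they are not the `crass` data above.

```r
# Hypothetical three-row input; column names mirror the example above
toy <- data.frame(WT_TRADE_PRICE = c(100, 104, 101),
                  min = c(99, 103, 100),
                  max = c(101, 105, 102))

# One step of the recurrence: `state` holds the previous Bid and Ask,
# `i` is the index of the current row
step <- function(state, i) {
  hit <- toy$WT_TRADE_PRICE[i] >= min(state$Ask, (state$Ask + state$Bid) / 2)
  list(Bid = if (hit) state$Bid else toy$min[i],
       Ask = if (hit) toy$max[i] else state$Ask)
}

# Iterate over row indices, keep every intermediate state, drop the seed
states <- Reduce(step, seq_len(nrow(toy)),
                 accumulate = TRUE,
                 init = list(Bid = 100, Ask = 102))[-1]

toy$Bid <- vapply(states, function(s) s$Bid, numeric(1))
toy$Ask <- vapply(states, function(s) s$Ask, numeric(1))
toy
```

Carrying both values in one list is what lets a single-output accumulator update two columns at once; the final vapply() calls play the role that unnest_wider() plays in the tidyverse version.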

Earlier revised for loop syntax

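# Columns by position: 1 = WT_TRADE_PRICE, 2 = min, 3 = max, 4 = Bid, 5 = Ask
for (i in 2:nrow(crass)) {
  if (crass[i, 1] >= min(crass[i-1, 5], (crass[i-1, 4] + crass[i-1, 5])/2)) {
    crass[i, 5] <- crass[i, 3]    # Ask takes the current max
    crass[i, 4] <- crass[i-1, 4]  # Bid carries over
  } else {
    crass[i, 4] <- crass[i, 2]    # Bid takes the current min
    crass[i, 5] <- crass[i-1, 5]  # Ask carries over
  }
}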

crass
# A tibble: 20 x 5
   WT_TRADE_PRICE   min   max   Bid   Ask
            <dbl> <dbl> <dbl> <dbl> <dbl>
 1          3801   3801  3801  3801  3802
 2          3801   3801  3801  3801  3802
 3          3801   3801  3801  3801  3802
 4          3797   3797  3797  3797  3802
 5          3797   3797  3797  3797  3802
 6          3796.  3795  3800  3795  3802
 7          3800   3800  3800  3795  3800
 8          3797   3797  3797  3797  3800
 9          3795.  3794  3797  3794  3800
10          3794   3794  3794  3794  3800
11          3793   3793  3793  3793  3800
12          3793   3793  3793  3793  3800
13          3794.  3793  3794  3793  3800
14          3795.  3794  3797  3794  3800
15          3793.  3790  3794  3790  3800
16          3789   3789  3789  3789  3800
17          3790   3790  3790  3790  3800
18          3788   3788  3788  3788  3800
19          3788   3788  3788  3788  3800
20          3788   3788  3788  3788  3800

crass before running the for loop

# A tibble: 20 x 5
   WT_TRADE_PRICE   min   max   Bid   Ask
            <dbl> <dbl> <dbl> <dbl> <dbl>
 1          3801   3801  3801  3801  3802
 2          3801   3801  3801    NA    NA
 3          3801   3801  3801    NA    NA
 4          3797   3797  3797    NA    NA
 5          3797   3797  3797    NA    NA
 6          3796.  3795  3800    NA    NA
 7          3800   3800  3800    NA    NA
 8          3797   3797  3797    NA    NA
 9          3795.  3794  3797    NA    NA
10          3794   3794  3794    NA    NA
11          3793   3793  3793    NA    NA
12          3793   3793  3793    NA    NA
13          3794.  3793  3794    NA    NA
14          3795.  3794  3797    NA    NA
15          3793.  3790  3794    NA    NA
16          3789   3789  3789    NA    NA
17          3790   3790  3790    NA    NA
18          3788   3788  3788    NA    NA
19          3788   3788  3788    NA    NA
20          3788   3788  3788    NA    NA

How to use accumulate function to calculate 2 intertwining variables

Here we need to generate two output columns simultaneously and iteratively from one input column and the given initial values of the two output columns. purrr::accumulate() normally produces one output from one input, and purrr::accumulate2() takes two inputs but still produces one output. My strategy with accumulate() is therefore as follows:

For further reference, see also this answer, where multiple results are generated in purrr::accumulate() using more than two arguments (input values) simultaneously.

  • I presume your first row only supplies the initial values for the rest of each column.
  • To get multiple outputs from purrr::accumulate() (an awesome function, I must say), you can make it return a tibble instead of a vector.
  • I removed the first row.
  • I used your initial values in the .init argument.
  • I created the output in tibble format,
  • using the syntax tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E, E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50).
  • If you look closely, C is generated with the given formula.
  • E, however, requires the newly generated (output) value of C, which is not available through .x or .y, so I substituted the formula for C into the expression for E.
  • Finally, I unnested the output using unnest_wider() from tidyr.
library(tidyverse, warn.conflicts = FALSE)

df <- structure(list(Scenario = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1), SlNo = c(NA,
1L, 5L, 9L, 13L, 17L, 21L, 25L, 29L, 33L), A = c(NA, 14, 1, 17,
8, 10, 9, 14, 14, 4), B = c(NA, 19, 13, 20, 7, 16, 12, 18, 15,
17), C = c(4, 0, 0, 0, 0, 0, 0, 0, 0, 0), E = c(6, 0, 0, 0, 0,
0, 0, 0, 0, 0)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))

df[1:4] %>%
  filter(Scenario != 0) %>%
  mutate(new = accumulate(B,
                          .init = tibble(C = 4, E = 6),
                          ~ tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E,
                                   E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50
                                   )
                          )[-1]
         ) %>%
  unnest_wider(new)
#> # A tibble: 9 x 6
#>   Scenario  SlNo     A     B     C     E
#>      <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1        1     1    14    19    32     1
#> 2        1     5     1    13    35    -2
#> 3        1     9    17    20  40.5  10.5
#> 4        1    13     8     7  42.8 -0.25
#> 5        1    17    10    16  42.1  8.12
#> 6        1    21     9    12  46.2  8.19
#> 7        1    25    14    18  54.3  22.3
#> 8        1    29    14    15  69.4  34.4
#> 9        1    33     4    17  91.1  58.1

Created on 2021-07-05 by the reprex package (v2.0.0)
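
As a cross-check of the substitution described above, here is a minimal base R sketch that runs the same recurrence with a plain loop. Subtracting the two formulas shows that E is simply the new C plus the current B minus 50, and the loop uses that directly. The vector names `B`, `C`, `E` and the loop itself are illustrative and not part of the original answer.

```r
# B values from the example data (first row already removed)
B <- c(19, 13, 20, 7, 16, 12, 18, 15, 17)

C <- numeric(length(B))
E <- numeric(length(B))
prev_C <- 4   # initial values, as passed to .init above
prev_E <- 6

for (i in seq_along(B)) {
  C[i] <- (1 - 0.5) * prev_C + 5 + B[i] + prev_E
  # Expanded form used in the accumulate() call:
  #   0.5 * prev_C + 5 + prev_E + 2 * B[i] - 50
  E[i] <- C[i] + B[i] - 50
  prev_C <- C[i]
  prev_E <- E[i]
}

head(data.frame(B, C, E), 4)
```

In a loop the fresh value of C is available by name, so no substitution is needed; inside the accumulate() lambda only the previous state (.x) and the current input (.y) exist, which is why the formula had to be expanded there.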

Cumulatively add number between rows in order to create new columns BUT with different way for the first element occurrence in a column in R

Since you can't compute all columns at once (you need to wait for previous iterations to be able to compute the result for line i), just use a loop. The row number per Sp1 can be done at once using dplyr:

df <- df %>% group_by(Sp1) %>% mutate(sp_row = row_number()) %>% ungroup()
df$new_start <- df$new_end <- NA
df$new_start[1] <- df$start[1]
df$new_end[1] <- df$end[1]
for (i in 2:nrow(df)) {
  if (df$sp_row[i] == 1) {
    # first row of a new Sp1 group: continue right after the previous end
    df$new_start[i] <- df$new_end[i-1] + 1
  } else {
    # later rows of a group: offset from the previous new_end
    df$new_start[i] <- df$start[i] - df$new_end[i-1]
  }
  df$new_end[i] <- df$new_start[i] + df$end[i] - df$start[i]
}
# A tibble: 5 x 6
  Sp1   start   end new_start new_end sp_row
  <chr> <int> <dbl>     <dbl>   <dbl>  <int>
1 A       100  1077       100    1077      1
2 B      2316  4088      1078    2850      1
3 B     26647 28746     23797   25896      2
4 B     50000 60000     24104   34104      3
5 C       450   789     34105   34444      1

There is at least one mistake in your example, by the way: 50000-25896 = 29053 is wrong; the difference is 24104.
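
To make the loop easy to re-run, here is a minimal self-contained sketch in base R. The input columns are read back from the printed output above (the original question's data is not shown here), and ave() is used as a base R stand-in for the dplyr row_number() step; both are assumptions of this sketch.

```r
# Input reconstructed from the Sp1, start and end columns of the output above
df <- data.frame(Sp1   = c("A", "B", "B", "B", "C"),
                 start = c(100, 2316, 26647, 50000, 450),
                 end   = c(1077, 4088, 28746, 60000, 789))

# Row number within each Sp1 group (base R stand-in for row_number())
df$sp_row <- ave(seq_len(nrow(df)), df$Sp1, FUN = seq_along)

df$new_start <- NA_real_
df$new_end   <- NA_real_
df$new_start[1] <- df$start[1]
df$new_end[1]   <- df$end[1]

for (i in 2:nrow(df)) {
  df$new_start[i] <- if (df$sp_row[i] == 1) {
    df$new_end[i - 1] + 1            # first row of a group
  } else {
    df$start[i] - df$new_end[i - 1]  # later rows of a group
  }
  df$new_end[i] <- df$new_start[i] + df$end[i] - df$start[i]
}

df
```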

Using pmap with c(...) part 2

The issue seems to be the mixing of a custom anonymous function (function(Weight, Days, ...), whose arguments are named after the columns) with the default lambda function (~, whose arguments are .x and .y when there are only two elements, or ..1, ..2, ..3 etc. when there are more). In the OP's code

library(dplyr)
library(purrr)
df %>%
  mutate(pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days))))

Here 'Weight' and 'Days' return the full column values from the original dataset, not the values from each row. If we still want to use the above command, we need to convert the data captured in each row to a tibble and use with()

df %>%
  pmap_dfr(., ~ with(as_tibble(list(...)),
                     setNames(rep(Weight, Days), seq_len(Days))))
# A tibble: 3 x 7
#    `1`   `2`   `3`   `4`   `5`   `6`   `7`
#  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1  0.03  0.03  0.03  0.03  0.03    NA    NA
#2  0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3  0.04  0.04  0.04    NA    NA    NA    NA

If we want the other columns,

df %>%
  pmap_dfr(., ~ c(list(...)[-(3:4)], with(as_tibble(list(...)),
                  setNames(rep(Weight, Days), seq_len(Days)))))
# A tibble: 3 x 9
#  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach      0.03  0.03  0.03  0.03  0.03    NA    NA
#2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3 Barbara Franklin  0.04  0.04  0.04    NA    NA    NA    NA

Or use rowwise

library(tidyr)
df %>%
  rowwise %>%
  mutate(out = list(setNames(rep(Weight, Days), seq_len(Days)))) %>%
  ungroup %>%
  unnest_wider(c(out)) %>%
  select(-Weight, -Days)
# A tibble: 3 x 9
#  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach      0.03  0.03  0.03  0.03  0.03    NA    NA
#2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3 Barbara Franklin  0.04  0.04  0.04    NA    NA    NA    NA
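
The named-argument form mentioned at the start of this answer can be sketched as follows. This is a base R illustration that uses Map() as a stand-in for purrr::pmap(), not the original answer's code; the data frame is reconstructed from the printed output above, and the helper name `pad_to` is hypothetical.

```r
# Data reconstructed from the output above
df <- data.frame(Name   = c("Antoine", "Antoine", "Barbara"),
                 School = c("Bach", "Ken", "Franklin"),
                 Weight = c(0.03, 0.02, 0.04),
                 Days   = c(5, 7, 3))

# Inside this function, Weight and Days are single per-row values,
# not whole columns
rows <- Map(function(Weight, Days) rep(Weight, Days), df$Weight, df$Days)

# Hypothetical helper: pad a vector with NA up to length n
pad_to <- function(v, n) c(v, rep(NA_real_, n - length(v)))

width <- max(df$Days)
wide  <- t(vapply(rows, pad_to, numeric(width), n = width))
colnames(wide) <- as.character(seq_len(width))

cbind(df[c("Name", "School")], wide)
```

Because the function's parameters are bound once per row, there is no ambiguity between a row value and a full column, which is the confusion the `~` lambda ran into inside mutate().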

Fast way to calculate values in cells based on values in previous rows

My friend's output/strategy is fabulous, but since base R's Reduce() cannot take two input vectors, I used this trick:

  • Generated fresh values of var1 in data.frame() inside Reduce().
  • Where you want the current value of var1, use .y;
  • where the previous value is needed, use .x$var1 instead.
  • Used the formula itself wherever I required the currently generated value of a variable.
  • The rest is pretty clear, I think.
  • accumulate = TRUE is needed because you want all the intermediate values.
  • Since the output here is a list, it is row-bound with rbind via do.call().

In base R you can do

do.call(rbind, Reduce(function(.x, .y) {data.frame(var1 = .y,
                                                   var2 = .x$var2 + .x$var3 -.y/constant,
                                                   var3 = .x$var1 + 0.1 * (.x$var2 + .x$var3 -.y/constant)/constant)},
                      dt$var1[-1],
                      init = data.frame(var1 = dt$var1[1], var2 = -3.12, var3 = 1),
                      accumulate = TRUE))

        var1          var2      var3
1 -92186.747 -3.120000e+00      1.00
2 -19163.504 -2.088501e+00 -92186.75
3 -18178.840 -9.218881e+04 -19163.52
4  -9844.679 -1.113523e+05 -18178.86
5 -16494.780 -1.295311e+05  -9844.70
6 -17088.058 -1.393758e+05 -16494.80

which can be emulated in tidyverse/purrr as follows

library(purrr)
accumulate(dt$var1[-1], .init = data.frame(var1 = dt$var1[1], var2 = -3.12, var3 = 1),
           ~ data.frame(var1 = .y,
                        var2 = .x$var2 + .x$var3 -(.y/constant),
                        var3 = .x$var1 + 0.1 * (.x$var2 + .x$var3 -(.y/constant))/constant)) %>% map_df(~.x)

        var1          var2      var3
1 -92186.747 -3.120000e+00      1.00
2 -19163.504 -2.088501e+00 -92186.75
3 -18178.840 -9.218881e+04 -19163.52
4  -9844.679 -1.113523e+05 -18178.86
5 -16494.780 -1.295311e+05  -9844.70
6 -17088.058 -1.393758e+05 -16494.80
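
Because `dt` and `constant` are not defined in this excerpt, the pattern itself can be tried on made-up numbers. The following is a minimal sketch, assuming a hypothetical input vector `x` and a state that tracks a running count, total and mean; none of these names come from the original question.

```r
x <- c(2, 4, 6, 8)   # hypothetical input

res <- do.call(rbind, Reduce(
  function(.x, .y) {
    # .x is the previous state (a one-row data.frame), .y the current input
    n     <- .x$n + 1
    total <- .x$total + .y
    # mean depends on the freshly computed total and n
    data.frame(n = n, total = total, mean = total / n)
  },
  x[-1],
  init = data.frame(n = 1, total = x[1], mean = x[1]),
  accumulate = TRUE))

res
```

Using a braced function body with local variables is one way to reuse a freshly computed value without repeating its formula, as an alternative to the substitution used in the answer.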

R: pass multiple arguments to accumulate/reduce

If we want to use accumulate2, we have to specify the arguments correctly: it takes two input arguments, here 'pw' and 'add', plus an initialization argument, which is the first value of 'x'. Because the calculation is grouped by 'ID', do the grouping before calling accumulate2, refer to the lambda's default arguments ..1, ..2 and ..3 in that order, and build the recursive function from them

library(dplyr)
library(purrr)
out <- df %>%
  group_by(ID) %>%
  mutate(x1 = accumulate2(pw[-1], add[-1], ~ ..1^..2 + ..3,
                          .init = first(x)) %>%
           flatten_dbl ) %>%
  ungroup

out$x1
#[1] 1 2 11
#[4] 4 1031 1201024845477409792

With more than 3 arguments, a for loop would be better

# initialize an empty vector
out <- c()
# loop over the unique IDs
for (id in unique(df$ID)) {
  # create a temporary subset of the data for that id
  tmp_df <- subset(df, ID == id)
  # initialize a temporary storage vector for the output
  tmp_out <- numeric(nrow(tmp_df))
  # initialize the first value with the first element of x
  tmp_out[1] <- tmp_df$x[1]
  # if the number of rows is greater than 1
  if (nrow(tmp_df) > 1) {
    # loop over the rows
    for (i in 2:nrow(tmp_df)) {
      # do the recursive calculation and update
      tmp_out[i] <- tmp_out[i - 1]^tmp_df$pw[i] + tmp_df$add[i]
    }
  }

  out <- c(out, tmp_out)

}

out
#[1] 1 2 11
#[4] 4 1031 1201024845477409792
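
Another option when several input columns are involved is to accumulate over row indices and look the inputs up inside the function. The following is a minimal base R sketch under that assumption, using Reduce() rather than accumulate2(); the data frame values are made up, since the original `df` is not shown here.

```r
# Hypothetical data; column names follow the answer above
df <- data.frame(ID  = c(1, 1, 1, 2, 2),
                 x   = c(2, 0, 0, 3, 0),
                 pw  = c(1, 2, 3, 1, 2),
                 add = c(0, 1, 2, 0, 4))

x1 <- numeric(nrow(df))
for (rows in split(seq_len(nrow(df)), df$ID)) {
  # Seed with the first x of the group, then update with the pw and add
  # of each later row, looked up by index
  x1[rows] <- Reduce(function(acc, i) acc^df$pw[i] + df$add[i],
                     rows[-1],
                     accumulate = TRUE,
                     init = df$x[rows[1]])
}

x1
```

Since the function only receives an index, any number of columns can take part in the update, which sidesteps the two-input limit of accumulate2().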

