Rolling Computation of Two Simultaneous Variables Iteratively or Rowwise, Using Three Other Given Variables

In this case, we need to generate two output columns simultaneously and iteratively from three input columns. purrr::accumulate() normally builds one output from one input, and purrr::accumulate2() builds one output from two inputs, so neither fits directly. My strategy with accumulate() is therefore as follows:

Re-arrange the three input columns into row-wise tibbles, so that the three inputs become a single list-column. For this, I generated a dummy column id so that each row is converted into its own one-row tibble.
I used dplyr::nest_by() for this.
For the output, I again had accumulate() generate a tibble instead of a vector.
Lastly, I converted both list-columns back to their original shapes using tidyr::unnest_wider().

crass[1:3] %>% 
  nest_by(id = row_number()) %>%
  ungroup() %>%
  mutate(new = accumulate(data, 
                          .init = list(Bid = 3801, Ask = 3802),
                          ~ tibble(Bid = ifelse(.y$WT_TRADE_PRICE >= min(.x$Ask, (.x$Ask + .x$Bid)/2),
                                                     .x$Bid,
                                                     .y$min),
                                        Ask = ifelse(.y$WT_TRADE_PRICE >= min(.x$Ask, (.x$Ask + .x$Bid)/2),
                                                     .y$max,
                                                     .x$Ask))
                          )[-1]) %>%
  unnest_wider(data) %>%
  unnest_wider(new)

# A tibble: 20 x 6
      id WT_TRADE_PRICE   min   max   Bid   Ask
   <int>          <dbl> <dbl> <dbl> <dbl> <dbl>
 1     1          3801   3801  3801  3801  3802
 2     2          3801   3801  3801  3801  3802
 3     3          3801   3801  3801  3801  3802
 4     4          3797   3797  3797  3797  3802
 5     5          3797   3797  3797  3797  3802
 6     6          3796.  3795  3800  3795  3802
 7     7          3800   3800  3800  3795  3800
 8     8          3797   3797  3797  3797  3800
 9     9          3795.  3794  3797  3794  3800
10    10          3794   3794  3794  3794  3800
11    11          3793   3793  3793  3793  3800
12    12          3793   3793  3793  3793  3800
13    13          3794.  3793  3794  3793  3800
14    14          3795.  3794  3797  3794  3800
15    15          3793.  3790  3794  3790  3800
16    16          3789   3789  3789  3789  3800
17    17          3790   3790  3790  3790  3800
18    18          3788   3788  3788  3788  3800
19    19          3788   3788  3788  3788  3800
20    20          3788   3788  3788  3788  3800
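
As a rough cross-check of the same recursion without nesting, one could also accumulate over row indices and carry Bid and Ask in a small list. This is only a sketch, not the approach used above: crass_small and step are hypothetical names, only the first seven rows are reproduced, and the row-6 trade price (printed as 3796. above) is assumed to be 3796.5.

```r
library(purrr)

# First seven rows of the example data. The trade price in row 6 is shown
# only as "3796." in the printed output, so 3796.5 is an assumed value.
crass_small <- data.frame(
  WT_TRADE_PRICE = c(3801, 3801, 3801, 3797, 3797, 3796.5, 3800),
  min            = c(3801, 3801, 3801, 3797, 3797, 3795,   3800),
  max            = c(3801, 3801, 3801, 3797, 3797, 3800,   3800)
)

# One step of the recursion: prev holds the previous Bid/Ask, i is the row index.
step <- function(prev, i) {
  row <- crass_small[i, ]
  hit_ask <- row$WT_TRADE_PRICE >= min(prev$Ask, (prev$Ask + prev$Bid) / 2)
  if (hit_ask) {
    list(Bid = prev$Bid, Ask = row$max)   # keep Bid, move Ask
  } else {
    list(Bid = row$min, Ask = prev$Ask)   # move Bid, keep Ask
  }
}

# Accumulate over row indices, then drop the .init element.
states <- accumulate(seq_len(nrow(crass_small)), step,
                     .init = list(Bid = 3801, Ask = 3802))[-1]

crass_small$Bid <- map_dbl(states, "Bid")
crass_small$Ask <- map_dbl(states, "Ask")
crass_small
```

Carrying the state as a plain list keeps each step cheap; the two map_dbl() calls play the role that unnest_wider() plays in the answer above.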

Earlier Revised for loop

Syntax


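for (i in 2:nrow(crass)) {
  # columns: 1 = WT_TRADE_PRICE, 2 = min, 3 = max, 4 = Bid, 5 = Ask
  if (crass[i, 1] >= min(crass[i-1, 5], (crass[i-1, 4] + crass[i-1, 5])/2)) {
    # trade at or above min(previous Ask, previous mid): move Ask, keep Bid
    crass[i, 5] <- crass[i, 3]
    crass[i, 4] <- crass[i-1, 4]
  } else {
    # otherwise: move Bid, keep Ask
    crass[i, 4] <- crass[i, 2]
    crass[i, 5] <- crass[i-1, 5]
  }
}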

crass
# A tibble: 20 x 5
   WT_TRADE_PRICE   min   max   Bid   Ask
            <dbl> <dbl> <dbl> <dbl> <dbl>
 1          3801   3801  3801  3801  3802
 2          3801   3801  3801  3801  3802
 3          3801   3801  3801  3801  3802
 4          3797   3797  3797  3797  3802
 5          3797   3797  3797  3797  3802
 6          3796.  3795  3800  3795  3802
 7          3800   3800  3800  3795  3800
 8          3797   3797  3797  3797  3800
 9          3795.  3794  3797  3794  3800
10          3794   3794  3794  3794  3800
11          3793   3793  3793  3793  3800
12          3793   3793  3793  3793  3800
13          3794.  3793  3794  3793  3800
14          3795.  3794  3797  3794  3800
15          3793.  3790  3794  3790  3800
16          3789   3789  3789  3789  3800
17          3790   3790  3790  3790  3800
18          3788   3788  3788  3788  3800
19          3788   3788  3788  3788  3800
20          3788   3788  3788  3788  3800

crass before running the for loop

# A tibble: 20 x 5
   WT_TRADE_PRICE   min   max   Bid   Ask
            <dbl> <dbl> <dbl> <dbl> <dbl>
 1          3801   3801  3801  3801  3802
 2          3801   3801  3801    NA    NA
 3          3801   3801  3801    NA    NA
 4          3797   3797  3797    NA    NA
 5          3797   3797  3797    NA    NA
 6          3796.  3795  3800    NA    NA
 7          3800   3800  3800    NA    NA
 8          3797   3797  3797    NA    NA
 9          3795.  3794  3797    NA    NA
10          3794   3794  3794    NA    NA
11          3793   3793  3793    NA    NA
12          3793   3793  3793    NA    NA
13          3794.  3793  3794    NA    NA
14          3795.  3794  3797    NA    NA
15          3793.  3790  3794    NA    NA
16          3789   3789  3789    NA    NA
17          3790   3790  3790    NA    NA
18          3788   3788  3788    NA    NA
19          3788   3788  3788    NA    NA
20          3788   3788  3788    NA    NA

How to use accumulate function to calculate 2 intertwining variables

In this case, we need to generate two output columns simultaneously and iteratively from one input column, given initial values for the two output columns. purrr::accumulate() normally builds one output from one input, and purrr::accumulate2() builds one output from two inputs, so my strategy with accumulate() is as follows:

For further reference, you may also see this answer, where multiple results are generated in purrr::accumulate() using more than two arguments (input values) simultaneously.

I presume your first row only supplies the initial values for the remaining rows of the columns.
To get multiple outputs from accumulate, you can make purrr::accumulate() (an awesome function, I must say) generate a tibble instead of a vector.
I removed the first row.
Used your initial values in the .init argument.
Created the output in tibble format.
Used the syntax tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E, E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50)
If you look closely, C is generated with the given formula.
But for E we require the newly generated (output) value of C; therefore, I substituted the formula for C there instead of referring to it through .x or .y.
Finally, unnested the output using unnest_wider() from tidyr.

library(tidyverse, warn.conflicts = F)

df <- structure(list(Scenario = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1),
                     SlNo = c(NA, 1L, 5L, 9L, 13L, 17L, 21L, 25L, 29L, 33L),
                     A = c(NA, 14, 1, 17, 8, 10, 9, 14, 14, 4),
                     B = c(NA, 19, 13, 20, 7, 16, 12, 18, 15, 17),
                     C = c(4, 0, 0, 0, 0, 0, 0, 0, 0, 0),
                     E = c(6, 0, 0, 0, 0, 0, 0, 0, 0, 0)),
                row.names = c(NA, -10L),
                class = c("tbl_df", "tbl", "data.frame"))

df[1:4] %>%
  filter(Scenario != 0) %>%
  mutate(new = accumulate(B, 
                          .init = tibble(C = 4, E = 6),
                          ~ tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E,
                                   E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50
                                   )
                          )[-1]
         ) %>% 
  unnest_wider(new)
#> # A tibble: 9 x 6
#>   Scenario  SlNo     A     B     C     E
#>      <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1        1     1    14    19  32    1   
#> 2        1     5     1    13  35   -2   
#> 3        1     9    17    20  40.5 10.5 
#> 4        1    13     8     7  42.8 -0.25
#> 5        1    17    10    16  42.1  8.12
#> 6        1    21     9    12  46.2  8.19
#> 7        1    25    14    18  54.3 22.3 
#> 8        1    29    14    15  69.4 34.4 
#> 9        1    33     4    17  91.1 58.1

Created on 2021-07-05 by the reprex package (v2.0.0)
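
The point about E needing the freshly generated C can also be written without expanding the formula by hand. Algebraically, the E expression above equals the new C plus .y minus 50, so a named step function can compute C first and reuse it. The sketch below is an illustration only; step, C_out and E_out are hypothetical names, and the state is carried as a plain list rather than a tibble.

```r
library(purrr)

B <- c(19, 13, 20)  # first three B values from the data above

# One step: prev holds the previous C and E, b is the current B value.
step <- function(prev, b) {
  C_new <- (1 - 0.5) * prev$C + 5 + b + prev$E  # same C formula as above
  E_new <- C_new + b - 50                       # E expressed through the fresh C
  list(C = C_new, E = E_new)
}

# Accumulate from the initial values, then drop the .init element.
states <- accumulate(B, step, .init = list(C = 4, E = 6))[-1]

C_out <- map_dbl(states, "C")  # 32, 35, 40.5
E_out <- map_dbl(states, "E")  # 1, -2, 10.5
```

These values agree with the first three rows of the output above.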

Cumulatively add number between rows in order to create new columns BUT with different way for the first element occurrence in a column in R

Since you can't compute all columns at once (you need to wait for previous iterations to be able to compute the result for line i), just use a loop. The row number per Sp1 can be done at once using dplyr:

df <- df %>% group_by(Sp1) %>% mutate(sp_row = row_number()) %>% ungroup()
df$new_start <- df$new_end <- NA
df$new_start[1] <- df$start[1]
df$new_end[1] <- df$end[1]
for( i in 2:nrow(df)) {
  if(df$sp_row[i]==1) {
    df$new_start[i] <- df$new_end[i-1]+1
    df$new_end[i] <- df$new_start[i]+df$end[i]-df$start[i]
  }
  if(df$sp_row[i]!=1) {
    df$new_start[i] <- df$start[i]-df$new_end[i-1]
    df$new_end[i] <- df$new_start[i]+df$end[i]-df$start[i]
  }
}
# A tibble: 5 x 6
  Sp1   start   end new_start new_end sp_row
  <chr> <int> <dbl>     <dbl>   <dbl>  <int>
1 A       100  1077       100    1077      1
2 B      2316  4088      1078    2850      1
3 B     26647 28746     23797   25896      2
4 B     50000 60000     24104   34104      3
5 C       450   789     34105   34444      1

There is at least one mistake in your example, by the way: 50000 - 25896 is 24104, not 29053.
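
If a loop-free form is preferred, the same recursion could probably be expressed with purrr::accumulate() by carrying new_start and new_end together in a list. This is a sketch under assumptions rather than part of the answer above: the data frame is retyped from the printed output, step is a hypothetical helper, and sp_row is built with base R's ave() instead of dplyr.

```r
library(purrr)

# Data retyped from the printed table above.
df <- data.frame(
  Sp1   = c("A", "B", "B", "B", "C"),
  start = c(100, 2316, 26647, 50000, 450),
  end   = c(1077, 4088, 28746, 60000, 789)
)

# Row number within each Sp1 group (base R equivalent of the dplyr step).
df$sp_row <- ave(seq_along(df$Sp1), df$Sp1, FUN = seq_along)

# One step: prev holds the previous new_start/new_end, i is the row index.
step <- function(prev, i) {
  new_start <- if (df$sp_row[i] == 1) {
    prev$new_end + 1               # first row of a new Sp1 group
  } else {
    df$start[i] - prev$new_end     # later rows of the same group
  }
  list(new_start = new_start,
       new_end   = new_start + df$end[i] - df$start[i])
}

# Row 1 supplies the initial state, so it is kept in the result.
states <- accumulate(2:nrow(df), step,
                     .init = list(new_start = df$start[1], new_end = df$end[1]))

df$new_start <- map_dbl(states, "new_start")
df$new_end   <- map_dbl(states, "new_end")
df
```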

Using pmap with c(...) part 2

The issue seems to be mixing up a custom anonymous function (function(Weight, Days, ...), where the arguments are named the same as the columns) with the formula-style lambda (~, where the arguments are .x and .y for two elements, or ..1, ..2, ..3, etc. for more than two). In the OP's code

library(dplyr)
library(purrr)
df %>% 
   mutate(pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days))))

Here 'Weight' and 'Days' return the full column values from the original dataset, not the values of each row. If we still want to use the above command, we need to convert the data captured in each row to a tibble and use with()

df %>%
     pmap_dfr(., ~ with(as_tibble(list(...)), 
             setNames(rep(Weight, Days), seq_len(Days))))
# A tibble: 3 x 7
#     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1  0.03  0.03  0.03  0.03  0.03 NA    NA   
#2  0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3  0.04  0.04  0.04 NA    NA    NA    NA

If we want the other columns,

df %>%
     pmap_dfr(., ~ c(list(...)[-(3:4)], with(as_tibble(list(...)), 
             setNames(rep(Weight, Days), seq_len(Days)))))
# A tibble: 3 x 9
#  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach      0.03  0.03  0.03  0.03  0.03 NA    NA   
#2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3 Barbara Franklin  0.04  0.04  0.04 NA    NA    NA    NA

Or use rowwise

library(tidyr)
df %>% 
    rowwise %>% 
    mutate(out = list(setNames(rep(Weight, Days), seq_len(Days)))) %>%
    ungroup %>%
    unnest_wider(c(out))  %>%
    select(-Weight, -Days)
# A tibble: 3 x 9
#  Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#  <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach      0.03  0.03  0.03  0.03  0.03 NA    NA   
#2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
#3 Barbara Franklin  0.04  0.04  0.04 NA    NA    NA    NA
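
To make the difference between the two lambda styles concrete, here is a small sketch. The data frame df_small is an assumption reconstructed from the printed output above (Weight and Days are inferred from the number of filled cells per row); in the named-function form the arguments are matched by column name, while in the ~ form they are referenced by position.

```r
library(purrr)

# Assumed reconstruction of the example data from the printed output.
df_small <- data.frame(
  Name   = c("Antoine", "Antoine", "Barbara"),
  School = c("Bach", "Ken", "Franklin"),
  Weight = c(0.03, 0.02, 0.04),
  Days   = c(5, 7, 3)
)

# Named anonymous function: Weight and Days are the per-row values;
# the remaining columns are absorbed by `...`.
by_name <- pmap(df_small, function(Weight, Days, ...) rep(Weight, Days))

# Formula lambda: the same per-row values, referenced by position
# (Weight is the 3rd column, Days the 4th).
by_position <- pmap(df_small, ~ rep(..3, ..4))

lengths(by_name)  # 5 7 3
```

Both calls should give the same list; the named form is usually easier to read when the column order might change.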

Fast way to calculate values in cells based on values in previous rows

Though my friend's output/strategy is fabulous, we cannot have two input vectors in base R's Reduce(), so I used this trick:

Generated fresh values of var1 in data.frame() inside the Reduce().
Where you want to use the current value of var1, use .y.
Where the previous value is to be used, use .x$var1 instead.
Used the full formula wherever the currently generated value of a variable is required.
The rest is pretty clear, I think.
accumulate = TRUE is needed because you want all intermediate values.
Since the output here is a list, it is row-bound using do.call(rbind, ...).

In base R you can do

do.call(rbind, Reduce(function(.x, .y) {data.frame(var1 = .y,
                            var2 = .x$var2 + .x$var3 -.y/constant,
                            var3 = .x$var1 + 0.1 * (.x$var2 + .x$var3 -.y/constant)/constant)}, 
       dt$var1[-1], 
       init =  data.frame(var1 = dt$var1[1], var2 = -3.12, var3 = 1),
       accumulate = TRUE))

        var1          var2      var3
1 -92186.747 -3.120000e+00      1.00
2 -19163.504 -2.088501e+00 -92186.75
3 -18178.840 -9.218881e+04 -19163.52
4  -9844.679 -1.113523e+05 -18178.86
5 -16494.780 -1.295311e+05  -9844.70
6 -17088.058 -1.393758e+05 -16494.80

which can be emulated in tidyverse/purrr as follows

library(purrr)
accumulate(dt$var1[-1], .init = data.frame(var1 = dt$var1[1], var2 = -3.12, var3 = 1), 
           ~ data.frame(var1 = .y,
                        var2 = .x$var2 + .x$var3 -(.y/constant),
                        var3 = .x$var1 + 0.1 * (.x$var2 + .x$var3 -(.y/constant))/constant)) %>% map_df(~.x)

        var1          var2      var3
1 -92186.747 -3.120000e+00      1.00
2 -19163.504 -2.088501e+00 -92186.75
3 -18178.840 -9.218881e+04 -19163.52
4  -9844.679 -1.113523e+05 -18178.86
5 -16494.780 -1.295311e+05  -9844.70
6 -17088.058 -1.393758e+05 -16494.80
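
Because dt and constant are not shown here, the pattern may be easier to follow on a toy input. The following base R sketch uses entirely hypothetical data and column names (x, total, scaled); it only illustrates carrying a one-row data.frame through Reduce() and reusing a freshly computed value within the same step.

```r
x <- c(3, 1, 4)  # hypothetical input vector

steps <- Reduce(
  function(prev, cur) {
    total <- prev$total + cur              # previous state plus current input
    data.frame(x      = cur,
               total  = total,
               scaled = prev$scaled / 2 + total)  # reuses the fresh total
  },
  x[-1],
  data.frame(x = x[1], total = x[1], scaled = x[1]),  # initial state (row 1)
  accumulate = TRUE
)

# Reduce() returns a list of one-row data frames; bind them into one.
out <- do.call(rbind, steps)
out
```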

R: pass multiple arguments to accumulate/reduce

If we want to use accumulate2(), the arguments need to be specified correctly: it takes two input arguments, here 'pw' and 'add', and an initialization argument, which is the first value of 'x'. As the calculation is grouped by 'ID', do the grouping before calling accumulate2(), refer to the lambda's default arguments ..1, ..2 and ..3 in that order, and build the recursive function from them

library(dplyr)
library(purrr)
out <- df %>%
   group_by(ID) %>% 
   mutate(x1 = accumulate2(pw[-1], add[-1], ~  ..1^..2 + ..3, 
             .init = first(x)) %>%
                flatten_dbl ) %>%
   ungroup

out$x1
#[1]    1                   2                  11   
#[4]    4                1031 1201024845477409792
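
For a single group, the argument order can be checked on a tiny input. The vectors below are hypothetical (the original df is not shown here); ..1 is the accumulated value, ..2 the current element of pw and ..3 the current element of add. unlist() is used so the sketch does not depend on whether accumulate2() returns a list or an already simplified vector.

```r
library(purrr)

# Hypothetical single-group data; the first elements of pw and add are unused.
x   <- c(1, NA, NA)
pw  <- c(1, 2, 3)
add <- c(0, 1, 3)

# ..1 = running value, ..2 = pw element, ..3 = add element
x1 <- unlist(accumulate2(pw[-1], add[-1], ~ ..1^..2 + ..3, .init = x[1]))
x1  # 1, then 1^2 + 1 = 2, then 2^3 + 3 = 11
```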

With more than 3 arguments, a for loop would be better

# initialize an empty vector
out <- c()
# loop over the unique IDs
for (id in unique(df$ID)) {
  # create a temporary subset of the data for that id
  tmp_df <- subset(df, ID == id)
  # initialize temporary storage for the output
  tmp_out <- numeric(nrow(tmp_df))
  # initialize the first value with the first element of x
  tmp_out[1] <- tmp_df$x[1]
  # if the number of rows is greater than 1
  if (nrow(tmp_df) > 1) {
    # loop over the rows
    for (i in 2:nrow(tmp_df)) {
      # do the recursive calculation and update
      tmp_out[i] <- tmp_out[i - 1]^tmp_df$pw[i] + tmp_df$add[i]
    }
  }
  # append this group's results to the overall output
  out <- c(out, tmp_out)
}

out
#[1] 1                   2                  11     
#[4] 4                1031 1201024845477409792
