R Self Reference

R self reference

Try package data.table and its := operator. It's very fast and very short.

DT[col1==something, col2:=col3+1]

The first part, col1==something, is the subset. You can put any expression here and use the column names as if they were variables; i.e., there is no need to use $. The second part, col2:=col3+1, assigns the RHS to the LHS within that subset, and again the column names can be assigned to as if they were variables. := is assignment by reference. No copies of any object are taken, so it is faster than <-, =, within and transform.
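To make the subset-and-assign pattern concrete, here is a minimal sketch. The table contents and the value "a" are made up for illustration; only the column names come from the snippet above. address() is used just to check that the table is not copied.

```r
library(data.table)

# Hypothetical example data; only the column names mirror the snippet above.
DT <- data.table(col1 = c("a", "b", "a", "c"),
                 col2 = c(0, 0, 0, 0),
                 col3 = c(10, 20, 30, 40))

addr_before <- address(DT)   # memory address of DT before the update

# For rows where col1 == "a", overwrite col2 with col3 + 1, in place.
DT[col1 == "a", col2 := col3 + 1]

addr_after <- address(DT)    # same address: DT was modified by reference

print(DT)
```

Rows that do not match the subset keep their old col2 value, and no DT <- ... reassignment is needed.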

Also, soon to be implemented in v1.8.1: one end goal of allowing := in j like this is to combine it with by. See the question "When should I use the := operator in data.table?".

UPDATE: That was indeed released (:= by group) in July 2012.
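A small sketch of what := by group looks like in practice. The table and the column names grp, val and grp_total are hypothetical.

```r
library(data.table)

# Hypothetical data with two groups.
DT <- data.table(grp = c("x", "x", "y", "y", "y"),
                 val = c(1, 2, 3, 4, 5))

# Compute sum(val) once per group and write it to every row of that
# group, adding the new column by reference.
DT[, grp_total := sum(val), by = grp]

print(DT)
```

The result keeps all five rows; each row carries the total of its own group.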

Self reference when indexing into a vector

You can use a pipe that lets you refer to the piped object as ., such as %>>% from pipeR:

library(pipeR)
my.vector.with.a.long.name %>>% `[`(.>5)
[1]  6  7  8  9 10
my.vector.with.a.long.name %>>% `[`(.%%2==0)
[1]  2  4  6  8 10
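If you would rather avoid an extra package, a similar effect is possible in base R with the native pipe and an anonymous function. This is a swapped-in technique, not the pipeR approach above, and it needs R 4.1 or later. The vector contents (1:10) are an assumption, chosen because they are consistent with the output shown above.

```r
# Assumption: the vector holds 1:10, consistent with the output above.
my.vector.with.a.long.name <- 1:10

# The native pipe |> passes the left-hand side as the first argument,
# so an anonymous function gives it a short local name (v).
big  <- my.vector.with.a.long.name |> (\(v) v[v > 5])()
even <- my.vector.with.a.long.name |> (\(v) v[v %% 2 == 0])()

print(big)
print(even)
```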

Self referencing calculation in group by in r

You can't reference the variable you're creating in mutate. Luckily, the variable being created in this case can be created with cumsum instead.

df %>% group_by(group,level) %>% mutate(v2 = cumsum(v1))
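A self-contained sketch of the grouped cumsum, using made-up data. Only the column names group, level, v1 and v2 come from the snippet above.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data; only the column names follow the snippet above.
df <- data.frame(group = c("a", "a", "a", "b", "b"),
                 level = c(1, 1, 2, 1, 1),
                 v1    = c(1, 2, 3, 4, 5))

out <- df %>%
  group_by(group, level) %>%
  mutate(v2 = cumsum(v1)) %>%   # running total restarts in each group
  ungroup()

print(out)
```

Each v2 value depends on the previous v2 plus the current v1, which is exactly the self-referencing recursion that cumsum computes without referring to v2 by name.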

How to produce a self-referencing variable in R (e.g., index levels given returns)?

As a workaround, you can use the following trick for the edited question. Note that you can adapt it to any number of simultaneous series.

I just added an extra group_by statement based on a modulo sequence over the required number of series, using seq(n()) %% 2.

set.seed(13)
dt <- data.frame(id = rep(letters[1:2], each = 5), time = rep(1:5, 2), ret = rnorm(10)/100)
dt$ind <- ifelse(dt$time == 1, 120, ifelse(dt$time == 2, 125, as.numeric(NA)))
library(dplyr, warn.conflicts = FALSE)

dt %>% group_by(id) %>%
  group_by(d = seq(n()) %% 2, .add = TRUE) %>%
  mutate(ind = cumprod(1 + duplicated(id) * ret)* ind[1])
#> # A tibble: 10 x 5
#> # Groups:   id, d [4]
#>    id     time      ret   ind     d
#>    <chr> <int>    <dbl> <dbl> <dbl>
#>  1 a         1  0.00554  120      1
#>  2 a         2 -0.00280  125      0
#>  3 a         3  0.0178   122.     1
#>  4 a         4  0.00187  125.     0
#>  5 a         5  0.0114   124.     1
#>  6 b         1  0.00416  120      0
#>  7 b         2  0.0123   125      1
#>  8 b         3  0.00237  120.     0
#>  9 b         4 -0.00365  125.     1
#> 10 b         5  0.0111   122.     0

OLD answer: Without using `purrr`

library(tidyverse)

set.seed(13)
dt <- data.frame(id = rep(letters[1:2], each = 4), time = rep(1:4, 2), ret = rnorm(8)/100)
dt$ind <- if_else(dt$time == 1, 100, as.numeric(NA))
dt
#>   id time          ret ind
#> 1  a    1  0.005543269 100
#> 2  a    2 -0.002802719  NA
#> 3  a    3  0.017751634  NA
#> 4  a    4  0.001873201  NA
#> 5  b    1  0.011425261 100
#> 6  b    2  0.004155261  NA
#> 7  b    3  0.012295066  NA
#> 8  b    4  0.002366797  NA

dt %>% group_by(id) %>%
  mutate(ind = cumprod(1 + duplicated(id) * ret)* ind[1])
#> # A tibble: 8 x 4
#> # Groups:   id [2]
#>   id     time      ret   ind
#>   <chr> <int>    <dbl> <dbl>
#> 1 a         1  0.00554 100  
#> 2 a         2 -0.00280  99.7
#> 3 a         3  0.0178  101. 
#> 4 a         4  0.00187 102. 
#> 5 b         1  0.0114  100  
#> 6 b         2  0.00416 100. 
#> 7 b         3  0.0123  102. 
#> 8 b         4  0.00237 102.

Created on 2021-07-27 by the reprex package (v2.0.0)
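Why the cumprod expression works: the index follows the recursion ind[t] = ind[t-1] * (1 + ret[t]) with a fixed starting level, and duplicated(id) is FALSE on the first row of each group, so the first return is masked out. The base R sketch below checks this for a single id by comparing an explicit recursion (via Reduce) with the cumprod form. The names start, mask, by_recursion and by_cumprod are introduced here for illustration only.

```r
# Returns for id "a", copied from the printed data above.
ret   <- c(0.005543269, -0.002802719, 0.017751634, 0.001873201)
start <- 100

# Explicit recursion: ind[1] = start, ind[t] = ind[t - 1] * (1 + ret[t]).
by_recursion <- Reduce(function(prev, r) prev * (1 + r), ret[-1],
                       accumulate = TRUE, init = start)

# Form used in the answer: the first return is multiplied by FALSE (0),
# so the first factor of the cumulative product is exactly 1.
mask       <- c(FALSE, rep(TRUE, length(ret) - 1))
by_cumprod <- start * cumprod(1 + mask * ret)

print(all.equal(by_recursion, by_cumprod))
```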

Generate self reference key within the table using R mutate in a dataframe

The Person_Id fields in your examples don't match.

I'm not sure if this is what you're after, but from your dput() I have created a data frame that drops the last column:

df_input <- df_output %>% 
  select(-Preceding_visit_id)

Then I did this:

df_input %>% 
  group_by(Person_Id) %>% 
  mutate(Preceding_visit_id = lag(Visit_Id))

And the output is this:

# A tibble: 14 x 4
# Groups:   Person_Id [3]
   Person_Id Visit_Id Purpose Preceding_visit_id
       <dbl>    <dbl> <chr>                <dbl>
 1         1        1 checkup                 NA
 2         1        2 checkup                  1
 3         1        3 checkup                  2
 4         1        4 checkup                  3
 5         1        5 checkup                  4
 6         2        6 checkup                 NA
 7         2        7 checkup                  6
 8         2        8 checkup                  7
 9         2        9 checkup                  8
10         2       10 checkup                  9
11         2       11 checkup                 10
12         3       12 checkup                 NA
13         3       13 checkup                 12
14         3       14 checkup                 13
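For a version you can run without the original dput(), here is a minimal sketch on a small made-up table. The data frame visits and its values are hypothetical; only the column names follow the question. Note that lag() relies on row order, so this assumes the rows are already sorted by visit within each person.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data, already sorted by visit within each person.
visits <- data.frame(Person_Id = c(1, 1, 1, 2, 2),
                     Visit_Id  = c(1, 2, 3, 6, 7))

out <- visits %>%
  group_by(Person_Id) %>%
  mutate(Preceding_visit_id = lag(Visit_Id)) %>%  # previous visit, same person
  ungroup()

print(out)
```

The first visit of each person gets NA because there is no earlier row in that group.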
