Creating a New Column Based on Unique Id With Values in R

Here is a base R version:

df <- data.frame(ID = c(1124, 1123))
expand.grid(ID = df$ID, Age = 0:5)

##      ID Age
## 1  1124   0
## 2  1123   0
## 3  1124   1
## 4  1123   1
## 5  1124   2
## 6  1123   2
## 7  1124   3
## 8  1123   3
## 9  1124   4
## 10 1123   4
## 11 1124   5
## 12 1123   5

This is sorted differently from the tidyr::expand result.
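If the row order matters, one option is to reorder the expand.grid() result so that each ID's rows sit together. This is a minimal base R sketch reusing the same two IDs; the names grid and grid_sorted are only for illustration.

```r
df <- data.frame(ID = c(1124, 1123))
grid <- expand.grid(ID = df$ID, Age = 0:5)

# expand.grid() varies the first column fastest, so sort by ID, then Age
grid_sorted <- grid[order(grid$ID, grid$Age), ]
rownames(grid_sorted) <- NULL  # drop the row numbers left over from the original order

head(grid_sorted, 3)
```

After sorting, all rows for one ID appear before the rows for the next.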

EDIT

As @thelatemail suggested, either of the following avoids having to write ID = df$ID explicitly:

expand.grid(c(Age=list(0:5), df))

merge(df, list(Age=0:5))

EDIT 2

Here is a data.table example:

library(data.table)
setDT(df) # Convert df to a data.table.
# CJ() builds the cross join; unlike expand.grid(), its result is sorted by ID, then Age.
df[, CJ(ID = ID, Age = 0:5)]

For large data sets, one might want to benchmark the various methods.
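As a rough starting point, base R's system.time() can compare two of the approaches above. This is only a sketch: the sizes (2000 IDs, ages 0 to 50) are made up, timings will vary by machine, and a dedicated benchmarking package would give more reliable numbers.

```r
# Hypothetical larger input: 2000 IDs crossed with ages 0..50
df <- data.frame(ID = seq_len(2000))
ages <- 0:50

t_grid  <- system.time(res_grid  <- expand.grid(ID = df$ID, Age = ages))
t_merge <- system.time(res_merge <- merge(df, list(Age = ages)))

# Elapsed wall-clock seconds for each method
c(expand.grid = t_grid[["elapsed"]], merge = t_merge[["elapsed"]])

# Both methods should produce the same number of ID/Age combinations
nrow(res_grid) == nrow(res_merge)
```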

Group dataframe rows by creating a unique ID column based on the amount of time passed between entries and variable values

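Here's a dplyr approach. Within each Name/Item group it computes the gap since the previous purchase and the running average gap, then flags large gaps. Every flagged row starts a new group. The first row of each Name/Item group gets an infinite gap, so a change in Name or Item also starts a new group.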

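library(dplyr)

df1 %>%
  group_by(Name, Item) %>%
  mutate(purch_num = row_number(),
         time_since_first = Date - first(Date),
         # The first row of each group has no previous date, so its gap is infinite
         gap = Date - lag(Date, default = as.Date(-Inf, origin = "1970-01-01")),
         avg_gap = time_since_first / (purch_num - 1),
         # Flag gaps over 180 days or over 3x the average gap so far
         new_grp_flag = gap > 180 | gap > 3 * avg_gap) %>%
  ungroup() %>%
  # Each flagged row starts a new group
  mutate(group = cumsum(new_grp_flag))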
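The cumsum() step is what turns the TRUE/FALSE flags into group numbers. Here is a small base R sketch of just that idea, using made-up dates for a single Name/Item and only the 180-day rule (the average-gap rule is left out for brevity):

```r
# Hypothetical purchase dates for one Name/Item combination
dates <- as.Date(c("2020-01-01", "2020-02-01", "2021-01-01", "2021-01-15"))

# Days since the previous purchase; the first purchase gets an infinite gap
gap <- c(Inf, diff(as.numeric(dates)))

# A gap of more than 180 days starts a new group
new_grp_flag <- gap > 180

# Running count of flags = group number
group <- cumsum(new_grp_flag)
group
```

The first two dates end up in group 1 and the last two in group 2, because the 335-day gap before the third date raises the flag.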

Add unique ID column based on values in two other columns (lat, long)

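We could use match(). Pasting with a separator (the default for paste()) keeps distinct coordinate pairs from collapsing into the same key:

transform(d, Cluster_ID = match(paste(LAT, LONG), unique(paste(LAT, LONG))))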

Or convert 'LAT' and 'LONG' to integer codes with match() and then take their interaction:

transform(d, Cluster_ID = as.integer(interaction(match(LAT, 
  unique(LAT)),  match(LONG, unique(LONG)), drop=TRUE, lex.order = FALSE)))
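To see what the match() approach produces, here is a sketch on a small made-up data frame. The coordinates in d are hypothetical; only the column names LAT and LONG come from the question.

```r
# Hypothetical coordinates; rows 1 and 2 share the same location
d <- data.frame(LAT  = c(10.5, 10.5, 20.1, 10.5),
                LONG = c(3.2,  3.2,  4.4,  7.7))

# Build one key per row, then number the keys in order of first appearance
key <- paste(d$LAT, d$LONG)
d$Cluster_ID <- match(key, unique(key))
d
```

Rows with the same LAT/LONG pair share a Cluster_ID, and IDs are assigned in the order the pairs first appear.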

Create a new column with unique identifier for each group

In pandas, use groupby(...).ngroup() + 1. Pass sort=False so that groups are numbered in the order they first appear in the DataFrame:

df['idx'] = df.groupby(['ID', 'phase'], sort=False).ngroup() + 1

df:

   ID phase side  values  idx
0  r1   ph1    l      12    1
1  r1   ph1    r      34    1
2  r1   ph2    l      93    2
3  s4   ph3    l      21    3
4  s3   ph2    l      88    4
5  s3   ph2    r      54    4
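For readers working in R rather than pandas, a roughly equivalent base R sketch is shown here. It rebuilds the data from the table above and numbers the ID/phase combinations in order of first appearance; the helper name key is an assumption for illustration.

```r
df <- data.frame(ID     = c("r1", "r1", "r1", "s4", "s3", "s3"),
                 phase  = c("ph1", "ph1", "ph2", "ph3", "ph2", "ph2"),
                 side   = c("l", "r", "l", "l", "l", "r"),
                 values = c(12, 34, 93, 21, 88, 54))

# One key per ID/phase combination
key <- paste(df$ID, df$phase)

# Setting levels = unique(key) keeps first-appearance order instead of alphabetical order
df$idx <- as.integer(factor(key, levels = unique(key)))
df
```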

Creating a new data frame in R based on unique values and time stamp

You can do:

df <- data.frame(ID = c(234, 546, 678, 546, 234),
                 PRIORITY = c("Reading", "Writing", "Communication", "Communication", "Writing"),
                 TIME = c("10/29", "10/30", "10/29", "11/1", "11/1"))

library(tidyverse)

df %>%
  group_by(ID) %>%
  mutate(ID_count = row_number()) %>%
  ungroup() %>%
  pivot_wider(id_cols = ID,
              values_from = c(PRIORITY, TIME),
              names_from = ID_count)

which gives:

# A tibble: 3 x 5
     ID PRIORITY_1    PRIORITY_2    TIME_1 TIME_2
  <dbl> <chr>         <chr>         <chr>  <chr> 
1   234 Reading       Writing       10/29  11/1  
2   546 Writing       Communication 10/30  11/1  
3   678 Communication <NA>          10/29  <NA>
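If the tidyverse is not available, something similar can be sketched in base R with ave() and reshape(). This is an alternative technique, not the approach shown above, and the column order of the result may differ from the pivot_wider() output.

```r
df <- data.frame(ID = c(234, 546, 678, 546, 234),
                 PRIORITY = c("Reading", "Writing", "Communication", "Communication", "Writing"),
                 TIME = c("10/29", "10/30", "10/29", "11/1", "11/1"))

# Number the rows within each ID (1 for the first entry, 2 for the second, ...)
df$ID_count <- ave(seq_along(df$ID), df$ID, FUN = seq_along)

# Spread PRIORITY and TIME into one set of columns per ID_count
wide <- reshape(df, idvar = "ID", timevar = "ID_count",
                direction = "wide", sep = "_")
wide
```

IDs with only one entry, such as 678, get NA in the _2 columns.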

How to create a new column based on flag on a different column R

We can use the dplyr package. This assumes each id has exactly one row with flag == "Y":

library(dplyr)

df |>
  group_by(id) |>
  mutate(base_value = result[which(flag == "Y")],
         percentage_change = (result - base_value) / base_value * 100) |>
  ungroup()

Output:

# A tibble: 8 × 5
     id result flag  base_value percentage_change
  <dbl>  <dbl> <chr>      <dbl>             <dbl>
1     1     12 ""            13             -7.69
2     1     33 ""            13            153.84  
3     1     13 "Y"           13              0   
4     1     44 ""            13            238.46  
5     2     23 "Y"           23              0   
6     2     44 ""            23             91.3 
7     2     52 ""            23            126.08  
8     2     11 ""            23            -52.17
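The same calculation can be sketched in base R with a lookup vector. The data frame below is rebuilt from the output above, the name base_lookup is only for illustration, and the sketch assumes exactly one "Y" row per id.

```r
df <- data.frame(id     = c(1, 1, 1, 1, 2, 2, 2, 2),
                 result = c(12, 33, 13, 44, 23, 44, 52, 11),
                 flag   = c("", "", "Y", "", "Y", "", "", ""))

# Named vector mapping each id to the result on its flagged row
flagged <- df[df$flag == "Y", ]
base_lookup <- setNames(flagged$result, flagged$id)

# Look up each row's base value by id, then compute the percentage change
df$base_value <- unname(base_lookup[as.character(df$id)])
df$percentage_change <- (df$result - df$base_value) / df$base_value * 100
df
```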
