Creating a new column based on unique ID with values in R
Here is a base R version:
df <- data.frame(ID = c(1124, 1123))
expand.grid(ID = df$ID, Age = 0:5)
## ID Age
## 1 1124 0
## 2 1123 0
## 3 1124 1
## 4 1123 1
## 5 1124 2
## 6 1123 2
## 7 1124 3
## 8 1123 3
## 9 1124 4
## 10 1123 4
## 11 1124 5
## 12 1123 5
This is sorted differently from the tidyr::expand result.
EDIT
As @thelatemail suggested, you can pass df itself so that its column names are reused and you do not have to spell out ID = df$ID:
expand.grid(c(Age=list(0:5), df))
or
merge(df, list(Age=0:5))
EDIT 2
Here is a data.table example:
library(data.table)
setDT(df) # Convert df to a data.table.
df[, do.call(CJ, list(ID = ID, Age = 0:5))]
For large data sets, one might want to benchmark the various methods.
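The base R and data.table calls above all perform a cross join: every ID is paired with every age. A minimal sketch of that logic in plain Python follows; the variable names are illustrative and not taken from any of the answers. itertools.product varies its last argument fastest, so rows come out grouped by ID, whereas the expand.grid output above varies ID fastest.

```python
from itertools import product

ids = [1124, 1123]   # the same IDs as in the R example
ages = range(0, 6)   # 0 through 5 inclusive

# One row per (ID, Age) combination
rows = [{"ID": i, "Age": a} for i, a in product(ids, ages)]

for row in rows[:3]:
    print(row)
```

With 2 IDs and 6 ages the result has 12 rows, the same count as the R output.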
Group dataframe rows by creating a unique ID column based on the amount of time passed between entries and variable values
Here's a dplyr approach that calculates the gap and rolling avg gap within each Name/Item group, then flags large gaps, and assigns a new group for each large gap or change in Name or Item.
library(dplyr)
df1 %>%
  group_by(Name, Item) %>%
  mutate(purch_num = row_number(),
         time_since_first = Date - first(Date),
         # the first row of each group gets an infinite gap, so it always starts a new group
         gap = Date - lag(Date, default = as.Date(-Inf)),
         avg_gap = time_since_first / (purch_num - 1),
         new_grp_flag = gap > 180 | gap > 3 * avg_gap) %>%
  ungroup() %>%
  mutate(group = cumsum(new_grp_flag))
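To make the flagging rule easier to follow, here is a rough plain-Python sketch of the same logic. It assumes rows are ordered by date within each Name/Item pair; the sample purchases, the function name assign_groups and its parameters are all hypothetical.

```python
from datetime import date

# Hypothetical purchases, ordered by date within each (name, item) pair
purchases = [
    ("Ann", "milk", date(2020, 1, 1)),
    ("Ann", "milk", date(2020, 1, 11)),
    ("Ann", "milk", date(2020, 1, 21)),
    ("Ann", "milk", date(2020, 1, 31)),
    ("Ann", "milk", date(2020, 5, 10)),  # 100-day gap, more than 3x the average gap
    ("Ann", "eggs", date(2020, 1, 5)),
    ("Ann", "eggs", date(2020, 9, 1)),   # gap longer than 180 days
]

def assign_groups(rows, max_gap=180, ratio=3):
    state = {}  # (name, item) -> (first date, previous date, rows seen so far)
    group = 0
    out = []
    for name, item, day in rows:
        key = (name, item)
        if key not in state:
            # First row of a Name/Item pair always opens a new group
            new_group = True
            first, seen = day, 0
        else:
            first, prev, seen = state[key]
            gap = (day - prev).days
            avg_gap = (day - first).days / seen  # seen equals purch_num - 1
            new_group = gap > max_gap or gap > ratio * avg_gap
        if new_group:
            group += 1  # running count of flags, like cumsum(new_grp_flag)
        state[key] = (first, day, seen + 1)
        out.append(group)
    return out

groups = assign_groups(purchases)
print(groups)
```

Because the average gap includes the current gap, the ratio rule with a factor of 3 can only fire from the fifth purchase of a pair onward; earlier large gaps are caught only by the 180-day rule.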
Add unique ID column based on values in two other columns (lat, long)
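We could use match, pasting the two columns with a separator so that distinct pairs cannot collapse into the same string:
transform(d, Cluster_ID = match(paste(LAT, LONG, sep = "_"), unique(paste(LAT, LONG, sep = "_"))))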
Or convert the 'LAT', 'LONG' to sequence and then do the interaction
transform(d, Cluster_ID = as.integer(interaction(match(LAT,
unique(LAT)), match(LONG, unique(LONG)), drop=TRUE, lex.order = FALSE)))
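A key built by pasting LAT and LONG into one string is only unambiguous when something keeps the two parts apart. The plain-Python sketch below uses made-up integer coordinates and a hypothetical helper, first_seen_ids, to show how a separator-free key can merge two different points, while keying on the pair itself cannot.

```python
def first_seen_ids(keys):
    """Number keys 1, 2, 3, ... in order of first appearance."""
    ids = {}
    return [ids.setdefault(k, len(ids) + 1) for k in keys]

# Made-up coordinates: (12, 3) and (1, 23) are different points
points = [(12, 3), (1, 23), (12, 3)]

# Gluing the numbers together with no separator gives "123" for both points
glued = [f"{lat}{lon}" for lat, lon in points]

ids_glued = first_seen_ids(glued)   # the two distinct points share an ID
ids_pairs = first_seen_ids(points)  # the pair itself is the key

print(ids_glued)
print(ids_pairs)
```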
Create a new column with unique identifier for each group
Try groupby with ngroup() + 1, and use sort=False to ensure groups are enumerated in the order they appear in the DataFrame:
df['idx'] = df.groupby(['ID', 'phase'], sort=False).ngroup() + 1
df:
ID phase side values idx
0 r1 ph1 l 12 1
1 r1 ph1 r 34 1
2 r1 ph2 l 93 2
3 s4 ph3 l 21 3
4 s3 ph2 l 88 4
5 s3 ph2 r 54 4
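To see why sort=False matters, the sketch below reproduces both numbering schemes in plain Python, without pandas, on the (ID, phase) pairs from the table above. It only illustrates the ordering and is not how pandas implements ngroup.

```python
# (ID, phase) pairs in the row order of the table above
keys = [("r1", "ph1"), ("r1", "ph1"), ("r1", "ph2"),
        ("s4", "ph3"), ("s3", "ph2"), ("s3", "ph2")]

# Order of first appearance, analogous to sort=False
appearance = {k: i + 1 for i, k in enumerate(dict.fromkeys(keys))}
idx_appearance = [appearance[k] for k in keys]

# Sorted group keys, analogous to the default sort=True
ranked = {k: i + 1 for i, k in enumerate(sorted(set(keys)))}
idx_sorted = [ranked[k] for k in keys]

print(idx_appearance)
print(idx_sorted)
```

In the sorted variant the s3 rows are numbered before the s4 row, because ("s3", "ph2") sorts ahead of ("s4", "ph3") even though it appears later in the data.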
Creating a new data frame in R based on unique values and time stamp
You can do:
df <- data.frame(ID = c(234, 546, 678, 546, 234),
PRIORITY = c("Reading", "Writing", "Communication", "Communication", "Writing"),
TIME = c("10/29", "10/30", "10/29", "11/1", "11/1"))
library(tidyverse)
df %>%
group_by(ID) %>%
mutate(ID_count = 1:n()) %>%
ungroup() %>%
pivot_wider(id_cols = ID,
values_from = c(PRIORITY, TIME),
names_from = ID_count)
which gives:
# A tibble: 3 x 5
ID PRIORITY_1 PRIORITY_2 TIME_1 TIME_2
<dbl> <chr> <chr> <chr> <chr>
1 234 Reading Writing 10/29 11/1
2 546 Writing Communication 10/30 11/1
3 678 Communication <NA> 10/29 <NA>
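The reshaping step amounts to numbering each ID's rows in order and using that number as a column suffix. A small plain-Python sketch of the idea, using the same records as the data frame above (the names seen and wide are illustrative):

```python
from collections import defaultdict

records = [
    (234, "Reading", "10/29"),
    (546, "Writing", "10/30"),
    (678, "Communication", "10/29"),
    (546, "Communication", "11/1"),
    (234, "Writing", "11/1"),
]

seen = defaultdict(int)  # how many rows each ID has produced so far
wide = {}                # ID -> one wide row

for id_, priority, time in records:
    seen[id_] += 1
    n = seen[id_]        # plays the role of ID_count
    row = wide.setdefault(id_, {"ID": id_})
    row[f"PRIORITY_{n}"] = priority
    row[f"TIME_{n}"] = time

for row in wide.values():
    print(row)
```

IDs with fewer rows simply lack the higher-numbered keys, which corresponds to the NA cells in the tibble.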
How to create a new column based on flag on a different column R
We can use the dplyr package:
df |> group_by(id) |>
mutate(base_value = result[which(flag == "Y")] ,
percentage_change = (result - base_value)/base_value * 100) |>
ungroup()
Output:
# A tibble: 8 × 5
id result flag base_value percentage_change
<dbl> <dbl> <chr> <dbl> <dbl>
1 1 12 "" 13 -7.69
2 1 33 "" 13 153.84
3 1 13 "Y" 13 0
4 1 44 "" 13 238.46
5 2 23 "Y" 23 0
6 2 44 "" 23 91.3
7 2 52 "" 23 126.08
8 2 11 "" 23 -52.17
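The calculation itself is a two-pass lookup: find the flagged result for each id, then express every result relative to it. A plain-Python sketch using the id, result and flag values from the output above, assuming exactly one "Y" row per id:

```python
# (id, result, flag) rows taken from the output above
rows = [
    (1, 12, ""), (1, 33, ""), (1, 13, "Y"), (1, 44, ""),
    (2, 23, "Y"), (2, 44, ""), (2, 52, ""), (2, 11, ""),
]

# Pass 1: the flagged result per id (assumes exactly one "Y" row per id)
base = {id_: result for id_, result, flag in rows if flag == "Y"}

# Pass 2: percentage change of each result relative to its id's base value
pct_change = [(result - base[id_]) / base[id_] * 100 for id_, result, _ in rows]

for (id_, result, flag), pct in zip(rows, pct_change):
    print(id_, result, repr(flag), base[id_], pct)
```

If an id had no flagged row, the lookup base[id_] would raise a KeyError, so real data may need a check for that case first.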