How do I subset column variables in DF1 based on the important variables I got in DF2?
I can't find a duplicate, so here goes: simply subset df2 by the values of as.character(df1$ID), as in
df2[as.character(df1$ID)] ## Or just `df2[df1$ID]` if it's already a character
# x1 x2 x5
# 1 1 11 41
# 2 2 12 42
# 3 3 13 43
# 4 4 14 44
# 5 5 15 45
The reason for as.character is to avoid subsetting by the underlying storage mode of df1$ID (integer) rather than by its levels, which is what happens when ID is a factor.
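To make the difference concrete, here is a minimal sketch with made-up data (the column names and values are assumptions, not the asker's data). It shows that a factor index is treated as its integer codes, so it selects columns by position, while the character version selects them by name.

```r
# Hypothetical data: the factor levels sort as x1, x2, x5,
# so the integer codes differ from the intended column names.
df1 <- data.frame(ID = factor(c("x5", "x1", "x2")))
df2 <- data.frame(x1 = 1:2, x2 = 11:12, x3 = 21:22, x4 = 31:32, x5 = 41:42)

codes   <- as.integer(df1$ID)                # 3 1 2
by_code <- names(df2[df1$ID])                # "x3" "x1" "x2" (positions 3, 1, 2)
by_name <- names(df2[as.character(df1$ID)])  # "x5" "x1" "x2"
```

If df1$ID is already a character vector, both forms return the same columns.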
This question is also tagged with data.table, so we could do this by reference instead (if we have a data.table), with no need to convert to character:
setDT(df2)[, setdiff(names(df2), df1$ID) := NULL]
df2
# x1 x2 x5
# 1: 1 11 41
# 2: 2 12 42
# 3: 3 13 43
# 4: 4 14 44
# 5: 5 15 45
How to create a dummy in dataframe according to value in another dataframe with a different length of observations in R?
Does this work:
library(dplyr)
df2 %>%
  rename('df2_year' = year) %>%
  left_join(df1, by = 'id') %>%
  group_by(id) %>%
  mutate(dummy = if_else(year >= df2_year, 1, 0)) %>%
  select(-df2_year)
# A tibble: 6 x 4
# Groups: id [2]
id year x1 dummy
<int> <int> <dbl> <dbl>
1 1 2017 0.3 0
2 1 2018 0.5 0
3 1 2019 0.45 1
4 1 2020 0.5 1
5 1 2021 0.6 1
6 2 NA NA NA
Data used:
df1
id year x1
1 1 2017 0.30
2 1 2018 0.50
3 1 2019 0.45
4 1 2020 0.50
5 1 2021 0.60
df2
id year
1 1 2019
2 2 2020
Note that id = 2 is missing from df1 in your sample data, which is why its row is filled with NA.
Creating new variable in dataframe based on matching values from other dataframe
I think this does what you want:
df1$z <- df2$b[match(df1$x,df2$a)]
df1$z[df1$x=='G']=NA
Output:
> df1
x z
1 A 1
2 <NA> NA
3 L NA
4 G 7
5 C 3
6 F 6
7 <NA> NA
8 J 10
9 G 7
10 K NA
Hope this helps!
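For readers unfamiliar with match(), here is a minimal sketch of the lookup step on made-up data (the `lookup` and `keys` objects are hypothetical, not the asker's data frames). match() returns, for each key, the row position in the lookup column, or NA when the key is absent; indexing with those positions then carries the NA through.

```r
# Hypothetical lookup table and keys, for illustration only.
lookup <- data.frame(a = c("A", "C", "F"), b = c(1, 3, 6))
keys   <- c("A", NA, "L", "C", "F")

pos <- match(keys, lookup$a)  # 1 NA NA 2 3: position of each key, NA if not found
z   <- lookup$b[pos]          # 1 NA NA 3 6: indexing with NA yields NA
```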
Replacing column names with another data frame if matches
If you are open to a tidyverse solution, you could use
library(dplyr)
library(tibble)
df %>%
rename_with(~deframe(df2)[.x], .cols = df2$Name) %>%
select(Name, Reference, any_of(df2$Adjusted_Name))
This returns
# A tibble: 3 x 6
Name Reference good_run very_great_work bad_run fair_run_decent
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 George Hill 34 21 33 21
2 Frank Stairs 29 30 29 28
3 Bertha Trail 25 21 24 25
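The key idea above is that deframe(df2) turns the two-column table into a named vector (old name -> new name) that is then indexed by the old column names. As a rough base R sketch of the same lookup, using setNames() in place of deframe() and a small made-up data frame (both are assumptions for illustration, not the answer's method):

```r
# Hypothetical, reduced data for illustration only.
df  <- data.frame(Name = "George", Good = 34, Fair = 21, Poor = 32)
df2 <- data.frame(Name = c("Good", "Fair"),
                  Adjusted_Name = c("good_run", "fair_run_decent"),
                  stringsAsFactors = FALSE)

# Named vector: names are the old column names, values are the new ones.
lookup <- setNames(df2$Adjusted_Name, df2$Name)

# Rename only the columns that have an entry in the lookup.
hit <- names(df) %in% names(lookup)
names(df)[hit] <- unname(lookup[names(df)[hit]])
new_names <- names(df)  # "Name" "good_run" "fair_run_decent" "Poor"
```

Unlike the dplyr version, this sketch keeps unmatched columns such as Poor; dropping them would need an extra subsetting step.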
Data
df <- structure(list(Name = c("George", "Frank", "Bertha"), Reference = c("Hill",
"Stairs", "Trail"), Good = c(34, 29, 25), Fair = c(21, 28, 25
), Bad = c(33, 29, 24), Great = c(21, 30, 21), Poor = c(32, 29,
26)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA,
-3L), spec = structure(list(cols = list(Name = structure(list(), class = c("collector_character",
"collector")), Reference = structure(list(), class = c("collector_character",
"collector")), Good = structure(list(), class = c("collector_double",
"collector")), Fair = structure(list(), class = c("collector_double",
"collector")), Bad = structure(list(), class = c("collector_double",
"collector")), Great = structure(list(), class = c("collector_double",
"collector")), Poor = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
df2 <- structure(list(Name = c("Good", "Great", "Bad", "Fair"), Adjusted_Name = c("good_run",
"very_great_work", "bad_run", "fair_run_decent")), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L), spec = structure(list(
cols = list(Name = structure(list(), class = c("collector_character",
"collector")), Adjusted_Name = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
Perform division within single column where classes are identical
Here is an approach that uses data.table::rleid()
library(data.table)
library(dplyr)  # needed for %>%, mutate(), arrange(), group_by(), select()
df %>%
  mutate(gp = class %in% c('A','B')) %>%
  arrange(class2,class) %>%
  group_by(id = rleid(class2,gp)) %>%
  mutate(result=value/value[class %in% c('A','C')]) %>%
  select(-gp,-id)
A data.table only approach would be:
setDT(df)[,gp:=class %chin% c('A','B')][
order(class2,class),result:=value/value[class %chin% c('A','C')],by=.(rleid(class2,gp))][
,gp:=NULL][]
Output:
id value class class2 desired.operation result
<int> <dbl> <chr> <chr> <chr> <dbl>
1 1 1 A W 1/1 1
2 1 5 B W 5/1 5
3 2 9 C W 9/9 1
4 2 13 D W 13/9 1.44
5 3 2 A X 2/2 1
6 3 6 B X 6/2 3
7 4 10 C X 10/10 1
8 4 14 D X 14/10 1.4
9 5 3 A Y 3/3 1
10 5 7 B Y 7/3 2.33
11 6 11 C Y 11/11 1
12 6 15 D Y 15/11 1.36
13 7 4 A Z 4/4 1
14 7 8 B Z 8/4 2
15 8 12 C Z 12/12 1
16 8 16 D Z 16/12 1.33
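Both versions rely on rleid() to number consecutive runs of identical values, which is what produces the id column above. As a minimal sketch of that idea for a single vector, here is a base R equivalent built on rle(); the input vector is made up, and this is an illustration of the concept rather than data.table's implementation.

```r
# Hypothetical input: three runs ("a","a"), ("b","b"), ("a").
x <- c("a", "a", "b", "b", "a")

r   <- rle(x)                                    # run lengths: 2 2 1
ids <- rep(seq_along(r$lengths), r$lengths)      # run ids:     1 1 2 2 3
```

The id increases every time the value changes, so a value that reappears later (the final "a") starts a new run rather than rejoining the first one.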