Dplyr Pipes - How to Change the Original Dataframe

You can definitely do the assignment by using an idiom such as df <- df %>% ... or df %>% ... -> df. But you could also avoid redundancy (i.e., stating df twice) by using the magrittr compound assignment operator %<>% at the beginning of the pipe.

From the magrittr vignette:

The compound assignment pipe operator %<>% can be used as the first pipe in a chain. The effect will be that the result of the pipeline is assigned to the left-hand side object, rather than returning the result as usual.

So with your code, we can do

library(magrittr)  ## provides %<>%; came with your dplyr install
library(dplyr)     ## provides slice() and select()
df %<>% slice(-(1:3)) %>% select(-c(Col1, Col50, Col51))

This pipes df into the expression and updates df as the result.
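
To make the difference from a plain pipe concrete, here is a minimal sketch on a small made-up data frame. The object toy and its columns are hypothetical stand-ins for the original df, not data from the question.

```r
library(magrittr)  # provides %<>% and %>%
library(dplyr)     # provides slice() and select()

# Hypothetical toy data standing in for the original df
toy <- data.frame(Col1 = 1:5, Col2 = letters[1:5], Col3 = 5:1)

# Plain pipe: the result is returned and toy itself is left unchanged
trimmed <- toy %>% slice(-(1:3)) %>% select(-Col1)
dim(toy)      # still 5 rows and 3 columns

# Compound assignment pipe: the result is written back to toy
toy %<>% slice(-(1:3)) %>% select(-Col1)
dim(toy)      # now 2 rows and 2 columns
```

Only the first operator in the chain is %<>%; the remaining steps use the ordinary %>%.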

Update: In the comments you note an issue setting the column names. Fortunately magrittr has provided functions for setting attributes in a pipe. Try the following.

df %<>%
  set_colnames(sprintf("Col%d", 1:ncol(.))) %>%
  slice(-(1:3)) %>%
  select(-c(Col1, Col50, Col51))

Note that since we have a data frame, we can also use setNames() (stats) or set_names() (magrittr) in place of set_colnames().
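
As a quick check that the three helpers are interchangeable for a data frame, the sketch below renames the columns of a small hypothetical data frame (toy) with each of them. The object names are illustrative only.

```r
library(magrittr)

# Hypothetical data frame with three columns
toy <- data.frame(x = 1:2, y = 3:4, z = 5:6)
new_names <- sprintf("Col%d", seq_len(ncol(toy)))

via_set_colnames <- toy %>% set_colnames(new_names)         # magrittr alias for `colnames<-`
via_setNames     <- toy %>% setNames(new_names)             # stats
via_set_names    <- toy %>% magrittr::set_names(new_names)  # magrittr alias for `names<-`

names(via_set_colnames)
identical(via_set_colnames, via_setNames)
identical(via_set_colnames, via_set_names)
```

The magrittr:: prefix on set_names() is there because other packages, such as purrr and rlang, export a function with the same name.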


Thanks to Steven Beaupre for adding the note from the vignette.

Change the column values within dplyr pipes

Another option could be:

library(magrittr)
library(dplyr)
mtcars %<>%
  mutate_at(vars(1), ~ !!select(., 2) %>% pull() * 4)

   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1   24   6 160.0 110 3.90 2.620 16.46  0  1    4    4
2   24   6 160.0 110 3.90 2.875 17.02  0  1    4    4
3   16   4 108.0  93 3.85 2.320 18.61  1  1    4    1
4   24   6 258.0 110 3.08 3.215 19.44  1  0    3    1
5   32   8 360.0 175 3.15 3.440 17.02  0  0    3    2
6   24   6 225.0 105 2.76 3.460 20.22  1  0    3    1
7   32   8 360.0 245 3.21 3.570 15.84  0  0    3    4
8   16   4 146.7  62 3.69 3.190 20.00  1  0    4    2
9   16   4 140.8  95 3.92 3.150 22.90  1  0    4    2
10  24   6 167.6 123 3.92 3.440 18.30  1  0    4    4
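
For comparison, the same kind of update can be written with a plain mutate() call. This sketch assumes the goal is simply to overwrite the first column (mpg) with four times the second (cyl), which is what the output above shows. It works on a copy named mt (a hypothetical name) so the built-in mtcars stays untouched.

```r
library(magrittr)
library(dplyr)

mt <- mtcars   # hypothetical working copy of the built-in data set

# Overwrite mpg with four times cyl and assign the result back to mt
mt %<>% mutate(mpg = cyl * 4)

head(mt[, c("mpg", "cyl")], 3)
```

Naming the columns directly is easier to read than selecting them by position, at the cost of hard-coding the names.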

How to pipe in dplyr

Why not just check whether they're in there and filter on that:

mdat %>% filter( sample %in% dge$samples$sample )

It's easier to understand and control than a join, and performance shouldn't be an issue.
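
A small sketch of the two approaches side by side, using made-up data. Here mdat and kept are hypothetical stand-ins for the question's mdat and dge$samples; the real objects are not shown in the answer.

```r
library(dplyr)

# Hypothetical stand-ins for mdat and dge$samples
mdat <- data.frame(sample = c("s1", "s2", "s3", "s4"), dose = c(1, 2, 3, 4))
kept <- data.frame(sample = c("s2", "s4"))

# Filter on membership, as suggested above
by_filter <- mdat %>% filter(sample %in% kept$sample)

# The join-based equivalent: keep rows of mdat that have a match in kept
by_join <- mdat %>% semi_join(kept, by = "sample")

by_filter
```

Both calls keep the same rows here; the filter() version just makes the condition visible at the point of use.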

How to update values in a dplyr pipe?

This can be handled with a nested ifelse() call, i.e.,

library(dplyr)

dataset0 %>%
  mutate(v1 = ifelse(people %in% c('father', 'mother', 'parents'), 'parents',
              ifelse(people %in% c('girl', 'boy', 'children'), 'children', 'grandparents')))

#        people           v1
#1       father      parents
#2      parents      parents
#3       father      parents
#4     children     children
#5         girl     children
#6          boy     children
#7 grand father grandparents
#8 grand mother grandparents
#9 grandparents grandparents
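
When there are more than two groups, dplyr's case_when() can express the same recoding without nesting. The sketch below rebuilds a data frame shaped like the output above; the contents of dataset0 are reconstructed from that output and are an assumption about the original data.

```r
library(dplyr)

# Hypothetical data shaped like dataset0 in the answer above
dataset0 <- data.frame(
  people = c("father", "parents", "father", "children", "girl", "boy",
             "grand father", "grand mother", "grandparents")
)

recoded <- dataset0 %>%
  mutate(v1 = case_when(
    people %in% c("father", "mother", "parents") ~ "parents",
    people %in% c("girl", "boy", "children")     ~ "children",
    TRUE                                         ~ "grandparents"  # everything else
  ))

recoded
```

Conditions are checked in order, so the final TRUE line plays the role of the innermost "else" in the nested ifelse() version.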

How to rename all columns in a dataframe after a specified row, using dplyr pipe

We can use row_to_names from janitor

library(janitor)
library(dplyr)
df %>%
  row_to_names(row_number = 1) %>%
  type.convert(as.is = TRUE) %>%
  as_tibble()
# A tibble: 3 x 4
#     H1    H2    H3    H4
#  <dbl> <dbl> <dbl> <dbl>
#1   0.9  2.17  2.59  2.24
#2   4.1  4.1   3.8   3.8
#3   4    4.1   4.1   4.09

Change values of a column conditionally using pipe() function

Two approaches:

Either use as.character in ifelse

library(dplyr)
df %>% mutate(Col3 = ifelse(Col1 == "a", "aa", as.character(Col1)))

Or use stringsAsFactors = FALSE while constructing the dataframe.

df = data.frame("Col1" = letters[1:6], "Col2" = 1:6, stringsAsFactors = FALSE)
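
Both fixes address the same likely cause: when Col1 is a factor, ifelse() takes the factor's underlying integer codes for the "no" branch instead of its labels. The sketch below illustrates this on hypothetical data where Col1 is explicitly a factor, which was the default for character columns in data.frame() before R 4.0.

```r
library(dplyr)

# Hypothetical data with Col1 stored as a factor
df <- data.frame(Col1 = factor(letters[1:6]), Col2 = 1:6)

# Without as.character(), the "no" branch contributes the factor's integer codes
with_codes <- df %>% mutate(Col3 = ifelse(Col1 == "a", "aa", Col1))

# With as.character(), the original labels are kept
with_labels <- df %>% mutate(Col3 = ifelse(Col1 == "a", "aa", as.character(Col1)))

with_codes$Col3
with_labels$Col3
```

If the column is already character, as it is by default in current versions of R, the two calls give the same result.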

Transform data to data.frame with the pipe operator

After the transpose, convert to a tibble with as_tibble() and change the column names with setNames()

library(dplyr)
library(tibble)
x %>%
  t %>%
  as_tibble(.name_repair = "unique") %>%
  setNames(c("a", "b"))
# A tibble: 1 x 2
#      a     b
#  <int> <int>
#1     1     2

Or another option if we want to use the OP's syntax would be to wrap the code with {}

x %>%
  {data.frame(a = .[1], b = .[2])}

dplyr mutate in place

We can use the %<>% compound assignment operator from magrittr to change in place

library(magrittr)
library(dplyr)
df_test %<>%
  mutate(a = round(a, 0))

If we are using data.table, the assignment operator (:=) modifies the column by reference, so no copy of the data is made

library(data.table)
setDT(df_test)[, a := round(a,0)]
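
The practical difference between the two approaches shows up when a second name points at the same data. In the sketch below (all object names and values are hypothetical), %<>% rebinds df_test to a new, modified object, while := changes the one table that both data.table names share.

```r
library(data.table)
library(magrittr)
library(dplyr)

# Data frame updated with the compound assignment pipe
df_test  <- data.frame(a = c(1.4, 2.6))
df_alias <- df_test                     # second name for the same values
df_test %<>% mutate(a = round(a, 0))
df_alias$a                              # unchanged: df_test now refers to a new object

# data.table updated by reference
dt_test  <- data.table(a = c(1.4, 2.6))
dt_alias <- dt_test                     # no copy is made here
dt_test[, a := round(a, 0)]
dt_alias$a                              # rounded as well: := changed the shared table
```

Use data.table::copy() when an independent copy of a data.table is needed before modifying it.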

Extracting outputs from for loops with dplyr pipes into dataframe in R

Something like this?

library(dplyr)
library(rstatix)  # provides t_test()

bind_rows(lapply(c("am", "vs"), function(i) {
  mtcars %>%
    t_test(formula(paste0("mpg ~ ", i)), detailed = TRUE) %>%
    mutate(var = i)
}))

Output:

# A tibble: 2 × 16
  estimate estimate1 estimate2 .y.   group1 group2    n1    n2 statistic       p    df conf.low conf.high method alternative var
     <dbl>     <dbl>     <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl>   <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>       <chr>
1    -7.24      17.1      24.4 mpg   0      1         19    13     -3.77 0.00137  18.3    -11.3     -3.21 T-test two.sided   am
2    -7.94      16.6      24.6 mpg   0      1         18    14     -4.67 0.00011  22.7    -11.5     -4.42 T-test two.sided   vs
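
The same loop-and-bind pattern also works with base R's t.test(). This is a rough sketch rather than the answer's method, and the columns kept in the result (var, statistic, p) are an assumption about what is needed downstream.

```r
library(dplyr)

results <- bind_rows(lapply(c("am", "vs"), function(v) {
  # Build the formula mpg ~ <v> and run Welch's two-sample t-test
  fit <- t.test(reformulate(v, response = "mpg"), data = mtcars)
  # One row per grouping variable
  tibble(var = v, statistic = unname(fit$statistic), p = fit$p.value)
}))

results
```

Each call to the anonymous function returns a one-row tibble, and bind_rows() stacks them into a single data frame.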

