Chain Arithmetic Operators in Dplyr with %>% Pipe

Chain arithmetic operators in dplyr with %>% pipe

Surround the operators with backticks or quotes, and things should work as expected:

1:10 %>% `*`(2) %>% sum
# [1] 110

1:10 %>% `/`(2) %>% sum
# [1] 27.5
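
If the backticks feel noisy, magrittr also exports named aliases for the arithmetic operators. A minimal sketch, assuming magrittr is attached (the variable names are only for illustration):

```r
library(magrittr)

# multiply_by() and divide_by() are magrittr aliases for `*` and `/`
doubled_sum <- 1:10 %>% multiply_by(2) %>% sum()
halved_sum  <- 1:10 %>% divide_by(2) %>% sum()

doubled_sum  # 110
halved_sum   # 27.5
```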

Use pipe operator %>% with replacement functions like colnames()<-

You could use colnames<- or setNames (thanks to @David Arenburg)

group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  `colnames<-`(c("cyl", "disp_mean", "hp_mean"))
  # or
  # `names<-`(c("cyl", "disp_mean", "hp_mean"))
  # setNames(., c("cyl", "disp_mean", "hp_mean"))

#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

Or pick an alias (set_colnames) from magrittr:

library(magrittr)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set_colnames(c("cyl", "disp_mean", "hp_mean"))

dplyr::rename may be more convenient if you are only (re)naming a few out of many columns (it requires writing both the old and the new name; see @Richard Scriven's answer)
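
For comparison, here is roughly what the rename route looks like on the same summary. This is a sketch, not the linked answer's exact code; it relies on summarise() naming unnamed results after their expressions, so the old names need backticks:

```r
library(dplyr)

res <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  # rename() takes new_name = old_name pairs; the old names contain
  # parentheses, so they must be wrapped in backticks
  rename(disp_mean = `mean(disp)`, hp_mean = `mean(hp)`)

names(res)
```

Only the columns you mention are renamed, which is why this scales better when most columns should keep their names.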

Nested pipe chain in dplyr / left_join

For your code to work, you need curly braces around the . in the y argument, as shown below:

df %>%
  left_join(x = ., y = {.} %>%
              distinct(Team, Date) %>%
              mutate(Date_Lagged = lag(Date)))

Joining, by = c("Team", "Date")
   Team       Date Points Date_Lagged
1     A 2016-05-10      1        <NA>
2     A 2016-05-10      4        <NA>
3     A 2016-05-10      3        <NA>
4     A 2016-05-10      2        <NA>
5     B 2016-05-12      1  2016-05-10
6     B 2016-05-12      5  2016-05-10
7     B 2016-05-12      6  2016-05-10
8     C 2016-05-15      1  2016-05-12
9     C 2016-05-15      2  2016-05-12
10    D 2016-05-30      3  2016-05-15
11    D 2016-05-30      9  2016-05-15

Or you can just do:

df %>%
  left_join(df %>%
              distinct(Team, Date) %>%
              mutate(Date_Lagged = lag(Date)))
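
A third option is to pull the inner chain out into a named intermediate, which avoids the nesting entirely. The sketch below rebuilds df from the printed output above (so the exact column types are an assumption) and passes by explicitly to silence the "Joining, by" message:

```r
library(dplyr)

# df reconstructed from the printed result above; column types are assumed
df <- data.frame(
  Team   = c(rep("A", 4), rep("B", 3), rep("C", 2), rep("D", 2)),
  Date   = as.Date(c(rep("2016-05-10", 4), rep("2016-05-12", 3),
                     rep("2016-05-15", 2), rep("2016-05-30", 2))),
  Points = c(1, 4, 3, 2, 1, 5, 6, 1, 2, 3, 9)
)

# One row per Team/Date pair, with the previous pair's date attached
lagged <- df %>%
  distinct(Team, Date) %>%
  mutate(Date_Lagged = lag(Date))

res <- df %>% left_join(lagged, by = c("Team", "Date"))
res
```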

How to do multiplication with magrittr pipes

You need to put * in quotes (or backticks) and call it like a function, as in "*"(). Also use 1 as the margin argument in prop.table to match the example.

mtcars %>%
  xtabs(~ gear + cyl, data = .) %>%
  prop.table(., 1) %>%
  "*"(100) %>%
  round(., 2)

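If calling "*" as a function reads oddly, the same multiplication can be written inside curly braces with the dot placeholder. A sketch of that variant on the same table (pct is just an illustrative name):

```r
library(magrittr)

pct <- mtcars %>%
  xtabs(~ gear + cyl, data = .) %>%
  prop.table(1) %>%      # row proportions
  { . * 100 } %>%        # braces let you write the arithmetic normally
  round(2)

pct
```

Each row of pct should add up to roughly 100, since the proportions are taken within each gear.
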
Use if() to use select() within a dplyr pipe chain

You need to make sure that the expression between the curly braces returns a data.frame regardless of the condition. So you need an else branch that passes . through unchanged.

cond <- FALSE

mtcars %>%
  group_by(cyl) %>%
  { if (cond) filter(., am == 1) else . } %>%
  summarise(m = mean(wt))

Works fine with TRUE or FALSE.

(Also note that a simple example like this really makes the question a lot more easy to grasp.)
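
To see both branches side by side, you could wrap the chain in a small function. Here mean_wt_by_cyl() and its manual_only argument are hypothetical names made up for this sketch:

```r
library(dplyr)

# Hypothetical helper: the filter step is applied only when manual_only is TRUE
mean_wt_by_cyl <- function(data, manual_only) {
  data %>%
    group_by(cyl) %>%
    { if (manual_only) filter(., am == 1) else . } %>%
    summarise(m = mean(wt))
}

all_cars    <- mean_wt_by_cyl(mtcars, FALSE)  # no filtering
manual_cars <- mean_wt_by_cyl(mtcars, TRUE)   # only rows with am == 1
```

Both calls return one row per cyl; only the means differ.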

Using table() in dplyr chain

This behavior is by design: https://github.com/tidyverse/magrittr/blob/00a1fe3305a4914d7c9714fba78fd5f03f70f51e/README.md#re-using-the-placeholder-for-attributes

Since you don't have a . on its own, the tibble is still passed as the first argument, so the call is really more like

... %>% table(., .$type, .$colour)

The official magrittr work-around is to use curly braces

... %>% {table(.$type, .$colour)}
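
Another option, if you don't mind loading magrittr itself, is its exposition pipe %$%, which makes the columns available by name. A small sketch using mtcars as a stand-in, since the original data with type and colour isn't shown:

```r
library(magrittr)

# Curly braces: the data is not inserted as the first argument
tab_braces <- mtcars %>% { table(.$cyl, .$gear) }

# Exposition pipe: columns of the left-hand side are exposed by name
tab_expose <- mtcars %$% table(cyl, gear)

tab_expose
```

The counts are the same either way; the %$% version also keeps the variable names as dimension labels.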

How to reuse parts of long chain of pipe operators in R?

Similar in syntax to desired pseudo-code:

library(dplyr)

subchain <- . %>%
  filter(mass > mean(mass, na.rm = TRUE)) %>%
  select(name, gender, homeworld)

all.equal(
  starwars %>%
    group_by(gender) %>%
    filter(mass > mean(mass, na.rm = TRUE)) %>%
    select(name, gender, homeworld),
  starwars %>%
    group_by(gender) %>%
    subchain()
)

This works by using a dot . as the start of a piping sequence. The effect is close to wrapping the steps in a function; magrittr calls the result a functional sequence. See ?functions and try magrittr::functions(subchain).
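
A smaller illustration of the same idea, independent of starwars (largest_magnitude is a made-up name for this sketch):

```r
library(magrittr)

# A pipeline that starts with . is stored as a reusable function
largest_magnitude <- . %>% abs() %>% max()

largest_magnitude(c(-7, 3, 5))  # 7

class(largest_magnitude)        # includes "fseq"
functions(largest_magnitude)    # the individual steps, as a list
```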

How to feed pipe into an inequality?

You need to use curly braces if you just want to use pipes.

seq(9) %>% {. > 4}

[1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE

I'd recommend using purrr if you're going to be piping these kinds of things, as it tends to give somewhat more readable code.

library(purrr)

map_lgl(seq(9), ~.x > 4)

[1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
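
If you would rather stay with plain pipes, two other spellings should give the same result: magrittr's is_greater_than() alias, or the backticked operator called as a function. A sketch (variable names are illustrative):

```r
library(magrittr)

# Alias exported by magrittr for `>`
above_four <- seq(9) %>% is_greater_than(4)

# Same thing, calling the operator directly with backticks
also_above_four <- seq(9) %>% `>`(4)

above_four
```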

tryCatch() or exists() in R pipe chain error handling

You can use select helper functions if you want the selection to be ignored when the column doesn't exist. Here matches("^1$") tries to select the column whose name is exactly 1. Since the data frame doesn't have that column, matches returns integer(0) and the selection is simply ignored:

library(tidyverse)
df %>%
  count(Experiment_Batch, Overall) %>%
  spread(Overall, n, fill = 0) %>%
  select(Experiment_Batch, matches("^1$"))

# A tibble: 6 x 1
# Experiment_Batch
#* <fctr>
#1 008_1
#2 008_6
#3 520_0
#4 944_10
#5 944_8
#6 944_9

matches returns integer(0) when none of the column names match the pattern, and select ignores that empty result:

matches("^1$", vars = c("0", "experiment"))
# integer(0)

matches("^1$", vars = c("0", "experiment", "1"))
# [1] 3
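
In newer versions of dplyr (1.0.0 and later, as far as I know) there is also any_of(), which takes the column names literally and skips the ones that are missing. A sketch with a made-up stand-in for the spread() result, since the original df isn't shown:

```r
library(dplyr)

# Hypothetical stand-in for the wide table: it has a "0" column but no "1" column
wide <- data.frame(
  Experiment_Batch = c("008_1", "008_6", "520_0"),
  `0` = c(2, 1, 4),
  check.names = FALSE
)

# any_of() silently drops names that are not present in the data
kept <- wide %>% select(any_of(c("Experiment_Batch", "1")))

names(kept)
```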

If you need to customize the error catch:

library(tidyverse)
df %>%
  count(Experiment_Batch, Overall) %>%
  spread(Overall, n, fill = 0) %>%
  {
    tryCatch(
      select(., Experiment_Batch, `1`),
      error = function(e) select(., Experiment_Batch)
    )
  }
# replace the error handler with your own function to handle the exception

# A tibble: 6 x 1
# Experiment_Batch
#* <fctr>
#1 008_1
#2 008_6
#3 520_0
#4 944_10
#5 944_8
#6 944_9

