Meaning of Error Using . Shorthand Inside Dplyr Function

As @aosmith noted in the comments, this is due to the way magrittr parses the dot in this case.

From ?'%>%':

Using the dot-place holder as lhs

When the dot is used as lhs, the result will be a functional sequence, i.e. a function which applies the entire chain of right-hand sides in turn to its input.
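
As a minimal sketch of what that passage means (the name to_root_sum and the input values are made up for illustration), a chain that starts with a bare dot gives you a function back rather than a value:

```r
library(magrittr)

# A chain that starts with a bare `.` is not evaluated immediately;
# magrittr returns a function (a "functional sequence") instead.
to_root_sum <- . %>% sum() %>% sqrt()

is.function(to_root_sum)

# Calling the function applies sum() and then sqrt() to its input.
result <- to_root_sum(c(9, 16))
result
```

So an argument written as . %>% mutate(...) would be passed along as a function, not as a data frame, which is presumably what the receiving verb complains about.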

To avoid triggering this, any modification of the expression on the lhs will do:

df %>%
  mutate(name = str_to_lower(name)) %>%
  bind_rows((.) %>% mutate(name = "New England"))

df %>%
  mutate(name = str_to_lower(name)) %>%
  bind_rows({.} %>% mutate(name = "New England"))

df %>%
  mutate(name = str_to_lower(name)) %>%
  bind_rows(identity(.) %>% mutate(name = "New England"))

Here's a suggestion that avoids the problem altogether:

df %>%
  # arbitrary piped operation
  mutate(name = str_to_lower(name)) %>%
  replicate(2, ., simplify = FALSE) %>%
  map_at(2, mutate_at, "name", ~"New England") %>%
  bind_rows()

# # A tibble: 12 x 2
# name estimate
# <chr> <dbl>
# 1 ct 501074
# 2 ma 1057316
# 3 me 47369
# 4 nh 76630
# 5 ri 141206
# 6 vt 27464
# 7 New England 501074
# 8 New England 1057316
# 9 New England 47369
# 10 New England 76630
# 11 New England 141206
# 12 New England 27464

Understand the warning message in across in R

There is not much difference between using where and not using it; the warning just suggests the preferred syntax. Basically, where takes a predicate function and applies it to every variable (column) of your data set. It then returns every variable for which the function returns TRUE. The following examples are taken from the documentation for where:

iris %>% select(where(is.numeric))
# or an anonymous function
iris %>% select(where(function(x) is.numeric(x)))
# or a purrr style formula as a shortcut for creating a function on the spot
iris %>% select(where(~ is.numeric(.x)))

You can also combine two conditions with &&:

# The following code selects the numeric variables whose means are greater than 3.5
iris %>% select(where(~ is.numeric(.x) && mean(.x) > 3.5))

You can use where(is.character) for the .cols argument of the across function and then apply the function given in the .fns argument to the selected columns.
For more information, refer to the documentation for where and across, which is the best source for learning more about them.
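
A minimal sketch of that pattern, assuming a small made-up tibble (df and its column names are illustrative only):

```r
library(dplyr)

# Hypothetical example data: two character columns and one numeric column.
df <- tibble(
  name = c("ct", "ma"),
  region = c("ne", "ne"),
  estimate = c(501074, 1057316)
)

# where(is.character) picks the columns; toupper is applied to each of them.
out <- df %>%
  mutate(across(where(is.character), toupper))
out
```

The numeric column is left untouched because is.character returns FALSE for it.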

What does the dplyr period character . reference?

The dot is used within dplyr mainly (not exclusively) in mutate_each, summarise_each and do. In the first two (and their SE counterparts) it refers to all the columns to which the functions in funs are applied. In do it refers to the (potentially grouped) data.frame so you can reference single columns by using .$xyz to reference a column named "xyz".

The reason you cannot run

filter(df, . == 5)

is that a) filter is not designed to work across multiple columns the way mutate_each is, and b) the dot is only defined inside a pipe, so you would need to use the pipe operator %>% (originally from magrittr).

However, you could use it with a function like rowSums inside filter when combined with the pipe operator %>%:

> filter(mtcars, rowSums(. > 5) > 4)
Error: object '.' not found

> mtcars %>% filter(rowSums(. > 5) > 4) %>% head()
mpg cyl disp hp drat wt qsec vs am gear carb
1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
3 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
4 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
5 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
6 14.3 8 360 245 3.21 3.570 15.84 0 0 3 4

You should also take a look at the magrittr help files:

library(magrittr)
help("%>%")

From the help page:

Placing lhs elsewhere in rhs call
Often you will want lhs to the rhs call at another position than the first. For this purpose you can use the dot (.) as placeholder. For example, y %>% f(x, .) is equivalent to f(x, y) and z %>% f(x, y, arg = .) is equivalent to f(x, y, arg = z).

Using the dot for secondary purposes
Often, some attribute or property of lhs is desired in the rhs call in addition to the value of lhs itself, e.g. the number of rows or columns. It is perfectly valid to use the dot placeholder several times in the rhs call, but by design the behavior is slightly different when using it inside nested function calls. In particular, if the placeholder is only used in a nested function call, lhs will also be placed as the first argument! The reason for this is that in most use-cases this produces the most readable code. For example, iris %>% subset(1:nrow(.) %% 2 == 0) is equivalent to iris %>% subset(., 1:nrow(.) %% 2 == 0) but slightly more compact. It is possible to overrule this behavior by enclosing the rhs in braces. For example, 1:10 %>% {c(min(.), max(.))} is equivalent to c(min(1:10), max(1:10)).
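
The three placeholder rules quoted above can be tried out with a short sketch like the following (the variable names a, b and d are arbitrary):

```r
library(magrittr)

# Dot as a direct argument: lhs goes only where the dot is.
a <- 4 %>% c(1, .)

# Dot only inside a nested call: lhs is also inserted as the first argument,
# so this is the same as c(1:3, length(1:3)).
b <- 1:3 %>% c(length(.))

# Braces switch off the first-argument rule.
d <- 1:10 %>% {c(min(.), max(.))}

a
b
d
```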

What is the meaning of the `~` operator in the tidyverse context?

Most commonly, it's a shorthand way of writing an anonymous function.

map_dbl(HEIGHT, ~ sum(.x, 5))

is the same as

map_dbl(HEIGHT, function(.x) {sum(.x, 5)})

It has other meanings in other contexts. For example, at the R prompt, type

?case_when

to see how case_when uses ~.
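
To put the two uses side by side, here is a small sketch; the heights vector and the labels are made-up values, not taken from the question:

```r
library(dplyr)
library(purrr)

heights <- c(150, 160, 170)  # made-up values

# In purrr, ~ builds a one-argument function with .x as its argument.
shifted <- map_dbl(heights, ~ sum(.x, 5))

# In case_when, ~ separates a condition (left) from the value to return (right).
label <- case_when(
  heights < 155 ~ "short",
  heights < 165 ~ "medium",
  TRUE ~ "tall"
)

shifted
label
```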

Problem programming with dplyr--column which is definitely a vector being picked up as a formula

{{a}} is shorthand for !!enquo(a), which captures the expression provided to a as well as the context where this expression should be evaluated. In your case, the context is the data frame, which is already being provided to the function. So, a better rlang verb to use here is ensym(a), which captures the symbol name provided to a instead:

plot_high_chart <- function(.data,
                            chart_type = "column",
                            x_value = "Year", # <-- Note: strings
                            y_value = "total",
                            group_value = "service") {
  .data %>%
    hchart(chart_type, hcaes(x = !!rlang::ensym(x_value), # <- ensym instead of {{
                             y = !!rlang::ensym(y_value),
                             group = !!rlang::ensym(group_value)))
}

As a bonus, the function will now work with symbols AND with strings:

data %>%
  plot_high_chart(x_value = "Year", y_value = "total", group_value = "service") # Works
data %>%
  plot_high_chart(x_value = Year, y_value = total, group_value = service) # Also works
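
If you don't have highcharter at hand, the same ensym behaviour can be checked with plain dplyr. This is only a sketch: col_mean is a hypothetical helper, and mtcars stands in for your data.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: averages one column, given as a string or a bare name.
col_mean <- function(.data, col) {
  col <- ensym(col)  # turns "cyl" or cyl into the symbol `cyl`
  .data %>%
    summarise(result = mean(!!col)) %>%
    pull(result)
}

from_string <- col_mean(mtcars, "cyl")
from_symbol <- col_mean(mtcars, cyl)

# Both calls refer to the same column.
identical(from_string, from_symbol)
```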

pivot_longer gives error when using dtplyr

Version 1.2.0 of dtplyr is now available on CRAN, which means this issue is now resolved!

For anyone experiencing this error, check/update your version of dtplyr to ensure you are running >=1.2.0:

install.packages("dtplyr")

(NB: dtplyr isn't updated along with the core tidyverse packages, so make sure to update it separately.)
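
One way to check the installed version before reinstalling is sketched below; has_min_version is a hypothetical helper written for this example, not part of dtplyr:

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

if (!has_min_version("dtplyr", "1.2.0")) {
  message("dtplyr is missing or older than 1.2.0; run install.packages(\"dtplyr\")")
}
```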

https://www.tidyverse.org/blog/2021/12/dtplyr-1-2-0/

https://cran.r-project.org/web/packages/dtplyr/index.html

Use dplyr's _if() functions like mutate_if() with a negative predicate function

We can use the shorthand notation ~ for an anonymous function in the tidyverse:

library(dplyr)
iris %>%
mutate_if(~ !is.numeric(.), as.character)

Or, without an anonymous function, use negate from purrr:

library(purrr)
iris %>%
mutate_if(negate(is.numeric), as.character)

In addition to negate, Negate from base R also works

iris %>%
mutate_if(Negate(is.numeric), as.character)

The same notation works with select_if/arrange_if:

iris %>%
  select_if(negate(is.numeric)) %>%
  head(2)
# Species
#1 setosa
#2 setosa
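
In dplyr 1.0.0 and later the _if variants are superseded by across(), so a rough equivalent of the mutate_if call, assuming a recent dplyr, might look like this:

```r
library(dplyr)

# Scoped-verb version, as used in this answer.
old_style <- iris %>%
  mutate_if(~ !is.numeric(.), as.character)

# Same idea with across(): !where(is.numeric) selects the non-numeric columns.
new_style <- iris %>%
  mutate(across(!where(is.numeric), as.character))

# Species ends up as character in both versions.
class(new_style$Species)
```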

