dplyr: Error in n(): function should not be called directly

I presume you have dplyr and plyr loaded in the same session. dplyr is not plyr. ddply is not a function in the dplyr package.

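Both dplyr and plyr have the functions summarise/summarize. If plyr is attached after dplyr, plyr's version masks dplyr's, and n() is then evaluated as an ordinary function call, which triggers this error.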

Look at the results of conflicts() to see masked objects.
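
A minimal sketch of that check and of one way around the masking. It assumes only that dplyr is installed; plyr is not needed to run it, and the names masked and counts are just illustrative.

```r
library(dplyr)

# Names that exist in more than one place on the search path.
# With plyr attached as well, this would also list summarise/summarize.
masked <- conflicts()
print(head(masked))

# Namespace-qualify the verbs so the dplyr versions are used
# no matter which package was attached last.
counts <- iris %>%
  dplyr::group_by(Species) %>%
  dplyr::summarise(n = n())
print(counts)
```

Qualifying the verbs is more typing, but it does not depend on the order of the library() calls.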

dplyr::n() returns Error: This function should not be called directly

So, I do not really have a problem, I can just avoid [writing dplyr::n()], but I'm curious about why it even happens.

Here's the source code for dplyr::n in dplyr 0.5.0:

function () {
    stop("This function should not be called directly")
}

That's why the fully qualified form raises this error: the function does nothing but throw an error. (My guess is that the error-throwing function dplyr::n exists so that n() can have a typical documentation page with examples.)

Inside of filter/mutate/summarise statements, n() is not calling this function. Instead, some internal function calculates the group sizes for the expression n(). That's why the following works when dplyr is not loaded:

n()
#> Error: could not find function "n"

library(magrittr)
iris %>% 
  dplyr::group_by(Species) %>% 
  dplyr::summarise(n = n())
#> # A tibble: 3 × 2
#>      Species     n
#>       <fctr> <int>
#> 1     setosa    50
#> 2 versicolor    50
#> 3  virginica    50

In the first call, n() cannot be mapped to a function, so we get an error. But when it is used inside a dplyr verb, n() does map to something and returns the group sizes.
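
To see both behaviours side by side without stopping the script, the direct call can be wrapped in tryCatch(). This is only a sketch: it attaches dplyr so that it does not depend on the internals of any particular version, and the names outside and inside are illustrative.

```r
library(dplyr)

# Calling n() directly, outside of any verb: capture the error
# instead of letting it halt the script.
outside <- tryCatch(n(), error = function(e) "error")

# The same n() inside summarise() returns the size of each group.
inside <- iris %>%
  group_by(Species) %>%
  summarise(n = n())

print(outside)
print(inside)
```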

dplyr error with summarise_ and n()

First, as explained in the comments, you mixed standard evaluation and non-standard evaluation. n() is not found because you can't use it like that in *_ functions. In dplyr before 0.7.0, you would use ~n() in summarise_.

However, things have changed in the tidyverse world.

Since version 0.7.0, dplyr uses a new system for programming with dplyr, called tidy evaluation, or tidy eval for short. All the *_ functions are now deprecated and should not be used in new code, unless you want to keep a dependency on an old dplyr version. I'd advise using tidy eval now. I will not explain it here; see the Programming vignette.

For example, now you would do something like this with dplyr (>= 0.7.0):

library(dplyr)
# quo is a tidy eval concept for quoting
grp_var <- quo(Species)
voi <- quo(Sepal.Length)
# use !! another tidy eval concept to unquote
dmp <- iris %>%
  select(!! grp_var, !! voi) %>% 
  group_by(!! grp_var) %>%
  summarise(Median_Value = median( !! voi ), Count = n())
dmp
#> # A tibble: 3 x 3
#>      Species Median_Value Count
#>       <fctr>        <dbl> <int>
#> 1     setosa          5.0    50
#> 2 versicolor          5.9    50
#> 3  virginica          6.5    50
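
If you want to reuse this in a function, the same idea can be sketched with enquo(), which quotes the arguments the caller passed. The function name median_and_count and its argument names are made up for illustration; this assumes dplyr >= 0.7.0.

```r
library(dplyr)

# Hypothetical helper: median of one column and row count per group.
median_and_count <- function(data, group, value) {
  # enquo() captures the expressions the caller typed for these arguments
  group <- enquo(group)
  value <- enquo(value)
  data %>%
    group_by(!!group) %>%
    summarise(Median_Value = median(!!value), Count = n())
}

res <- median_and_count(iris, Species, Sepal.Length)
print(res)
```

The call passes bare column names, just like the built-in verbs, and n() stays inside summarise() where it belongs.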

Using a homemade function in a dplyr pipeline generating an unused argument error

I am not sure, but I think you tried to do the following:

library(dplyr)

trial <- c(rep(1,25), rep(2, 25), rep(3, 25))
minitime <- c(1:25)
time <- c(rep(minitime, 3))
X <- c(runif(75))
Y <- c(runif(75))
df <- as.data.frame(cbind(trial, time, X, Y))

df <- df %>%
  group_by(trial) %>%
  mutate(Y2 = lead(Y, 1),
         velocity = Y2 - Y)

head(df)
#> # A tibble: 6 x 6
#> # Groups:   trial [1]
#>   trial  time       X      Y     Y2 velocity
#>   <dbl> <dbl>   <dbl>  <dbl>  <dbl>    <dbl>
#> 1     1     1 0.757   0.118  0.174    0.0565
#> 2     1     2 0.686   0.174  0.0533  -0.121 
#> 3     1     3 0.219   0.0533 0.322    0.269 
#> 4     1     4 0.243   0.322  0.0700  -0.252 
#> 5     1     5 0.158   0.0700 0.738    0.668 
#> 6     1     6 0.00458 0.738  0.323   -0.415

Created on 2020-11-04 by the reprex package (v0.3.0)

As @Rui Barradas mentions in the comments, your shift function would also work:

df <- df %>%
  group_by(trial) %>%
  mutate(Y2 = shift(Y, 1),
         velocity = Y2 - Y)
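
Your shift function is not shown here, so the sketch below uses a hypothetical stand-in that I am assuming behaves like lead(): it moves values n positions toward the front and pads the end with NA. The point is only that a plain vector function can be called on a column inside mutate(), once per group.

```r
library(dplyr)

# Hypothetical stand-in for the asker's shift(); the real one may differ.
# Drops the first n values and pads the end with NA.
shift <- function(x, n = 1) {
  c(x[seq_along(x) > n], rep(NA, min(n, length(x))))
}

df <- data.frame(trial = rep(1:3, each = 25), Y = runif(75))

res <- df %>%
  group_by(trial) %>%
  mutate(Y2 = shift(Y, 1),        # custom helper applied per group
         Y2_lead = lead(Y, 1),    # built-in equivalent, for comparison
         velocity = Y2 - Y)
```

Because the data is grouped by trial, the last row of each trial gets NA instead of borrowing the first value of the next trial.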
