dplyr::n() returns "Error: Error: n() should only be called in a data context"

dplyr::n() returns Error: This function should not be called directly

So I do not really have a problem, since I can just avoid [writing dplyr::n()], but I'm curious about why it even happens.

Here's the source code for dplyr::n in dplyr 0.5.0:

function () {
    stop("This function should not be called directly")
}

That's why the fully qualified form raises this error: the function does nothing except call stop(). (My guess is that the error-throwing function dplyr::n exists so that n() could have a typical documentation page with examples.)

Inside of filter/mutate/summarise statements, n() is not calling this function. Instead, some internal function calculates the group sizes for the expression n(). That's why the summarise() call below works even when dplyr is not attached, while a bare n() fails:

n()
#> Error: could not find function "n"

library(magrittr)
iris %>% 
  dplyr::group_by(Species) %>% 
  dplyr::summarise(n = n())
#> # A tibble: 3 × 2
#>      Species     n
#>       <fctr> <int>
#> 1     setosa    50
#> 2 versicolor    50
#> 3  virginica    50

In the first call, n() cannot be mapped to a function, so we get an error. But when it is used inside a dplyr verb, n() does map to something and returns group sizes.
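If all you need is the group sizes, one way to sidestep n() entirely is dplyr::count(). This is a minimal sketch, assuming dplyr is installed but not attached; the variable name sizes is only for illustration.

```r
# Count rows per group without calling n() at all.
# dplyr::count() is fully qualified, so dplyr does not need to be attached.
sizes <- dplyr::count(iris, Species)
print(sizes)
```

The result has one row per Species with the group size in a column named n.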

dplyr: Error in n(): function should not be called directly

I presume you have dplyr and plyr loaded in the same session. dplyr is not plyr. ddply is not a function in the dplyr package.

Both dplyr and plyr have the functions summarise/summarize.

Look at the results of conflicts() to see masked objects.
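A rough sketch of how to check this, assuming both plyr and dplyr are installed; mtcars is used only as stand-in data, and the names both and res are illustrative:

```r
library(plyr)   # assumption: plyr is installed; attached first
library(dplyr)  # attached second, so its summarise() masks plyr's

# List the names that both attached packages define.
both <- intersect(ls("package:plyr"), ls("package:dplyr"))
print(both)

# Qualifying the verbs removes any ambiguity about which summarise() runs.
res <- mtcars %>%
  dplyr::group_by(cyl) %>%
  dplyr::summarise(n = n())
print(res)
```

If plyr is attached after dplyr instead, an unqualified summarise() resolves to plyr's version, which knows nothing about n(); qualifying the call as dplyr::summarise() avoids that.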

Error: `n()` must only be used inside dplyr verbs

As far as I can tell, unlike dplyr (which accepts pretty much any summary function that returns a scalar, as well as its own specialized functions such as n()), srvyr::summarize gives you a limited choice of functions: from ?srvyr::summarize,

Summarise for ‘tbl_svy’ objects accepts several specialized
functions. [emphasis added]

i.e., survey_mean, survey_total, survey_ratio, and a couple of others.

Here's a hack that seems to work: calculate the sum (survey_total) of the inverse weights.

library(srvyr)
data(api, package="survey")
aa <- (apistrat 
      %>% as_survey_design(strata=stype, weights=pw) 
      %>% group_by(stype) 
)
aa %>% summarize(n=survey_total(1/pw))

This matches the output of table(apistrat$stype).
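The reason the hack works, as far as I understand it, is that a weighted total of x is sum(weight * x), so with x = 1/weight every row contributes exactly 1. Here is a base R sketch of that arithmetic; the toy data frame and its weights are made up for illustration and are not from the api data set.

```r
# Hypothetical toy data: a stratum label and a sampling weight per row.
toy <- data.frame(
  stype = c("E", "E", "H", "M", "M", "M"),
  pw    = c(44.2, 44.2, 15.1, 21.0, 21.0, 21.0)
)

# A weighted total of x is sum(pw * x); with x = 1 / pw each row adds 1,
# so the per-stratum total is just the number of rows in that stratum.
weighted_n <- tapply(toy$pw * (1 / toy$pw), toy$stype, sum)
print(weighted_n)
print(table(toy$stype))
```

This only mimics the point estimate; the standard error that survey_total also reports is not reproduced here.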

Transpose a data frame when rows contain two values for the same variable in R

Here is one way to do it. We can number the repeated rows for each Marker within each sample, paste that number onto the Marker name to get a unique column name, and then convert the data to wide format.

library(dplyr)
library(tidyr)

data2 <- data %>%
  group_by_at(vars(-value)) %>%
  mutate(N = row_number() - 1) %>%
  unite(col = "Marker", Marker, N, sep = ".") %>%
  pivot_wider(names_from = "Marker", values_from = "value") %>%
  ungroup()
data2
# # A tibble: 2 x 8
#   Sample.File Sample.Name xxx.0 xxx.1 yyy.0 yyy.1 zzz.0 zzz.1
#   <fct>       <fct>       <int> <int> <int> <int> <int> <int>
# 1 a           a_1            16    18    16    20     9    13
# 2 b           b_1            10    10     6    12    14    14
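The input data is not shown above. The sketch below rebuilds a hypothetical version of it from the printed result, so the pipeline can be run end to end; the row order within each Marker is an assumption.

```r
library(dplyr)
library(tidyr)

# Hypothetical input, reconstructed from the printed result above;
# the original question's data is not shown here.
data <- data.frame(
  Sample.File = rep(c("a", "b"), each = 6),
  Sample.Name = rep(c("a_1", "b_1"), each = 6),
  Marker      = rep(rep(c("xxx", "yyy", "zzz"), each = 2), times = 2),
  value       = c(16L, 18L, 16L, 20L, 9L, 13L, 10L, 10L, 6L, 12L, 14L, 14L)
)

data2 <- data %>%
  group_by_at(vars(-value)) %>%        # group by every column except value
  mutate(N = row_number() - 1) %>%     # 0, 1, ... within each Marker
  unite(col = "Marker", Marker, N, sep = ".") %>%
  pivot_wider(names_from = "Marker", values_from = "value") %>%
  ungroup()
print(data2)
```

On R 4.0 or later the ID columns will print as character rather than factor, because data.frame() no longer converts strings to factors by default.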
