dplyr::n() returns "Error: Error: n() should only be called in a data context"

dplyr::n() returns Error: This function should not be called directly

So, I do not really have a problem, I can just avoid [writing dplyr::n()], but I'm curious about why it even happens.

Here's the source code for dplyr::n in dplyr 0.5.0:

function () {
  stop("This function should not be called directly")
}

That's why the fully qualified form raises this error: the function does nothing except throw an error. (My guess is that the error-throwing function dplyr::n exists so that n() can have a typical documentation page with examples.)
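A quick way to confirm this, assuming dplyr is installed: call the qualified function outside any verb and capture the condition. The exact message differs between dplyr versions, so this sketch only checks that an error is raised rather than matching its text.

```r
# Call dplyr::n() outside of any dplyr verb and capture the resulting condition.
# Assumes dplyr is installed; it does not need to be attached.
direct <- tryCatch(dplyr::n(), error = function(e) e)

# TRUE when the direct call signals an error instead of returning a count.
is_error <- inherits(direct, "error")
print(is_error)
```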

Inside of filter/mutate/summarise statements, n() is not calling this function. Instead, some internal function calculates the group sizes for the expression n(). That's why the following works when dplyr is not loaded:

n()
#> Error: could not find function "n"

library(magrittr)
iris %>%
  dplyr::group_by(Species) %>%
  dplyr::summarise(n = n())
#> # A tibble: 3 × 2
#>   Species        n
#>   <fctr>     <int>
#> 1 setosa        50
#> 2 versicolor    50
#> 3 virginica     50

Called on its own, n() cannot be mapped to a function, so we get an error. But when it is used inside of a dplyr verb, n() does map to something and returns group sizes.
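The general idea can be illustrated in base R. This is only a toy sketch of "a verb that rebinds a name before evaluating the expression", not dplyr's actual implementation; count_rows() and toy_summarise() are hypothetical names made up for the illustration.

```r
# Hypothetical stand-in for dplyr::n(): errors whenever it is called directly.
count_rows <- function() {
  stop("This function should not be called directly")
}

# Hypothetical verb: captures the expression unevaluated, then evaluates it in
# an environment where count_rows() is rebound to something that knows the data.
toy_summarise <- function(df, expr) {
  expr <- substitute(expr)
  mask <- new.env(parent = parent.frame())
  mask$count_rows <- function() nrow(df)
  eval(expr, envir = mask)
}

# Outside the verb, the stub is what gets called, so we capture its message.
direct <- tryCatch(count_rows(), error = function(e) conditionMessage(e))

# Inside the verb, the rebound version is found first and returns the row count.
inside <- toy_summarise(iris, count_rows())

print(direct)
print(inside)
```

The stub never runs inside the verb because the lookup finds the replacement in the evaluation environment first.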

dplyr: Error in n(): function should not be called directly

I presume you have dplyr and plyr loaded in the same session. dplyr is not plyr. ddply is not a function in the dplyr package.

Both dplyr and plyr have the functions summarise/summarize.

Look at the results of conflicts() to see masked objects.
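A minimal sketch of how to check for the clash and sidestep it, assuming dplyr is installed (plyr is not required to run it). find() lists the attached packages that define a name in search-path order, so the first entry is the one an unqualified call would use.

```r
library(dplyr)

# List every attached package that defines `summarise`. If plyr were attached
# after dplyr, "package:plyr" would come first and mask dplyr's version.
print(find("summarise"))

# conflicts(detail = TRUE) would list all masked objects by package.

# Qualifying the verbs removes the ambiguity regardless of load order.
by_species <- iris %>%
  dplyr::group_by(Species) %>%
  dplyr::summarise(n = n())
print(by_species)
```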

Error: `n()` must only be used inside dplyr verbs

As far as I can tell, unlike dplyr (which accepts pretty much any summary function that returns a scalar, as well as its own specialized functions such as n()), srvyr::summarize gives you a limited choice of functions: from ?srvyr::summarize,

Summarise for ‘tbl_svy’ objects accepts several specialized functions. [emphasis added]

i.e., survey_mean, survey_total, survey_ratio, and a couple of others.

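Here's a hack that seems to work: calculate the weighted total (survey_total) of the inverse weights. Each row then contributes pw * (1/pw) = 1 to the total, so the result is the unweighted number of rows in each group.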

library(srvyr)
data(api, package = "survey")
aa <- apistrat %>%
  as_survey_design(strata = stype, weights = pw) %>%
  group_by(stype)
aa %>% summarize(n = survey_total(1 / pw))

This matches table(apistrat$stype).
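The arithmetic behind the hack can be checked in base R without srvyr. The data frame below is made up for illustration (the weights are not the real apistrat values); the point is only that summing weight times inverse weight within each stratum gives the stratum's row count.

```r
# Hypothetical toy data: a stratum label and a sampling weight per row.
toy <- data.frame(
  stype = c("E", "E", "H", "M", "M", "M"),
  pw    = c(44.2, 44.2, 15.1, 20.4, 20.4, 20.4)
)

# A weighted total of x = 1/pw is sum(pw * x) within each stratum.
weighted_total <- tapply(toy$pw * (1 / toy$pw), toy$stype, sum)

# Plain row counts per stratum, for comparison.
counts <- table(toy$stype)

print(weighted_total)
print(counts)
```

Because of floating-point rounding, the two should be compared with all.equal() rather than identical().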

Transpose a data frame when rows contain two values for the same variable in R

Here is one way to do it. Group by every column except value, number the repeated values within each group (0, 1, ...), and append that number to the Marker name. After that, we can convert the data to wide format.

library(dplyr)
library(tidyr)

data2 <- data %>%
  group_by_at(vars(-value)) %>%
  mutate(N = row_number() - 1) %>%
  unite(col = "Marker", Marker, N, sep = ".") %>%
  pivot_wider(names_from = "Marker", values_from = "value") %>%
  ungroup()
data2
# # A tibble: 2 x 8
#   Sample.File Sample.Name xxx.0 xxx.1 yyy.0 yyy.1 zzz.0 zzz.1
#   <fct>       <fct>       <int> <int> <int> <int> <int> <int>
# 1 a           a_1            16    18    16    20     9    13
# 2 b           b_1            10    10     6    12    14    14
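The answer does not show the input, so here is a self-contained sketch with a long-format data frame reconstructed from the output above (the row order is an assumption). It names the grouping columns explicitly and ungroups before unite(), which is a small variation on the answer's code rather than a copy of it.

```r
library(dplyr)
library(tidyr)

# Assumed input, rebuilt from the printed result: two values per Marker
# for each Sample.File / Sample.Name pair.
data <- data.frame(
  Sample.File = rep(c("a", "b"), each = 6),
  Sample.Name = rep(c("a_1", "b_1"), each = 6),
  Marker      = rep(rep(c("xxx", "yyy", "zzz"), each = 2), times = 2),
  value       = c(16L, 18L, 16L, 20L, 9L, 13L, 10L, 10L, 6L, 12L, 14L, 14L)
)

wide <- data %>%
  group_by(Sample.File, Sample.Name, Marker) %>%
  mutate(N = row_number() - 1) %>%   # 0 for the first value, 1 for the second
  ungroup() %>%
  unite(col = "Marker", Marker, N, sep = ".") %>%
  pivot_wider(names_from = "Marker", values_from = "value")

print(wide)
```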

