Dplyr::Do() Requires Named Function

dplyr::do() requires named function?

You don't need an anonymous function:

library(dplyr)
iris %>%
  group_by(Species) %>%
  do({
    mod <- lm(Sepal.Length ~ Sepal.Width, data = .)
    pred <- predict(mod, newdata = .["Sepal.Width"])
    data.frame(., pred)
  })
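
For comparison, the same per-group computation can also be written with a named helper, which shows that do() accepts either form. This is only a minimal sketch; add_pred is a hypothetical name and not part of the original answer.

```r
library(dplyr)

# Hypothetical helper: fits a per-group model and appends its predictions.
add_pred <- function(d) {
  mod <- lm(Sepal.Length ~ Sepal.Width, data = d)
  data.frame(d, pred = predict(mod, newdata = d["Sepal.Width"]))
}

res <- iris %>%
  group_by(Species) %>%
  do(add_pred(.))
```

Here res should contain the original iris columns plus a pred column, one row per input row.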

dplyr - using column names as function arguments

This works with the tidy evaluation syntax in recent dplyr versions (as can be seen on GitHub):

library(dplyr)
library(rlang)
sumByColumn <- function(df, colName) {
  df %>%
    group_by(a) %>%
    summarize(tot = sum(!! sym(colName)))
}

sumByColumn(data, "b")
## A tibble: 2 x 2
#      a   tot
#  <int> <int>
#1     1    24
#2     2    27

Alternatively, pass b as a bare (unquoted) column name:

library(dplyr)
sumByColumn <- function(df, colName) {
  myenc <- enquo(colName)
  df %>%
    group_by(a) %>%
    summarize(tot = sum(!!myenc))
}

sumByColumn(data, b)
## A tibble: 2 x 2
#      a   tot
#  <int> <int>
#1     1    24
#2     2    27
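
The data object is not shown in the answers above. The sketch below uses a hypothetical stand-in, built so that the group sums come out as 24 and 27, to run both variants side by side. The names sum_by_string and sum_by_name are illustrative only.

```r
library(dplyr)
library(rlang)

# Hypothetical stand-in for `data`, chosen so the group sums are 24 and 27.
data <- tibble(
  a = c(1L, 1L, 1L, 2L, 2L, 2L),
  b = c(7L, 8L, 9L, 8L, 9L, 10L)
)

# String version: sym() turns the string into a symbol, !! unquotes it.
sum_by_string <- function(df, colName) {
  df %>%
    group_by(a) %>%
    summarize(tot = sum(!!sym(colName)))
}

# Bare-name version: enquo() captures the argument, !! unquotes it.
sum_by_name <- function(df, colName) {
  col <- enquo(colName)
  df %>%
    group_by(a) %>%
    summarize(tot = sum(!!col))
}

res_string <- sum_by_string(data, "b")
res_name <- sum_by_name(data, b)
```

Both calls should return the same two-row summary; the only difference is whether the caller passes "b" as a string or b as a bare name.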

Passing column name as argument in function within pipes

You need to use non-standard evaluation, which is worth a quick read. In this case you most likely need to convert var to a symbol with sym() and unquote it with !! in the mutate line.

Here's the line:

mutate(new_variable = !!sym(var) * 100)
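
To show that line in context, here is a minimal sketch of a pipe-friendly wrapper. The function name scale_by_name and the use of mtcars are assumptions made for illustration; the original question's data and function are not shown.

```r
library(dplyr)
library(rlang)

# Hypothetical wrapper; `var` is a column name given as a string.
scale_by_name <- function(df, var) {
  df %>%
    mutate(new_variable = !!sym(var) * 100)
}

res <- mtcars %>% scale_by_name("wt")
```

Because the data frame is the first argument, the wrapper can sit inside a longer pipeline.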

Dplyr write a function with column names as inputs

Is this what you expected?

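library(dplyr)

# tibble() replaces the deprecated tbl_df(data.frame(...))
df <- tibble(group = rep(c("A", "B"), each = 3), var1 = sample(1:100, 6), var2 = sample(1:100, 6))

example <- function(colname) {
  df %>%
    group_by(group) %>%
    # colname holds a quoted symbol, so unquote it with !! before use
    summarize(output = mean(sqrt(!!colname))) %>%
    select(output)
}
example(quote(var1))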
#-----
Source: local data frame [2 x 1]

    output
1 7.185935
2 8.090866

Passing (function) user-specified column name to dplyr do()

This follows from the regular do() semantics, where there is no data masking apart from the . pronoun:

do(df, data.frame(y = sum(.$response)))
#>   y
#> 1 6

do(df, data.frame(y = sum(.[[response]])))
#> Error: object 'response' not found

So you just need to capture the bare column name as a string and there is no need to unquote since there is no data masking:

sum_with_do <- function(df, x, ...) {
  # ensym() guarantees that `x` is a simple column name and not a
  # complex expression:
  x <- as.character(ensym(x))

  df %>%
    group_by(...) %>%
    do(data.frame(y = sum(.[[x]])))
}
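
As a usage sketch, the function can be called with bare names for both the summed column and the grouping column. The data frame below is hypothetical; g and response are illustrative column names.

```r
library(dplyr)
library(rlang)

sum_with_do <- function(df, x, ...) {
  # Capture the bare column name and turn it into a string.
  x <- as.character(ensym(x))

  df %>%
    group_by(...) %>%
    do(data.frame(y = sum(.[[x]])))
}

# Hypothetical data; `g` and `response` are illustrative column names.
df <- data.frame(g = c("a", "a", "b"), response = 1:3)

res <- sum_with_do(df, response, g)
```

The result should have one row per group, with y holding the per-group sum of response.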

How to refer to variable (column name) with tidyverse in a function?

You can call the function using symbols rather than strings for the column names by using the {{ ('curly curly') operator:

library(tidyverse)

f3 <- function(x){
 mtcars %>% 
    group_by(cyl, gear) %>% 
    summarize(m = mean({{x}}), 
         sd = sd({{x}}),
         n = length({{x}}),
         se = sd / sqrt(n),
         tscore = qt(0.975, n-1),
         margin = tscore * se,
         uppma = m + margin,
         lowma = m - margin,
         .groups = 'drop')
}

f3(x = wt)
#> # A tibble: 8 x 10
#>     cyl  gear     m     sd     n     se tscore  margin  uppma   lowma
#>   <dbl> <dbl> <dbl>  <dbl> <int>  <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
#> 1     4     3  2.46 NA         1 NA     NaN    NaN     NaN    NaN    
#> 2     4     4  2.38  0.601     8  0.212   2.36   0.502   2.88   1.88 
#> 3     4     5  1.83  0.443     2  0.314  12.7    3.98    5.81  -2.16 
#> 4     6     3  3.34  0.173     2  0.123  12.7    1.56    4.89   1.78 
#> 5     6     4  3.09  0.413     4  0.207   3.18   0.657   3.75   2.44 
#> 6     6     5  2.77 NA         1 NA     NaN    NaN     NaN    NaN    
#> 7     8     3  4.10  0.768    12  0.222   2.20   0.488   4.59   3.62 
#> 8     8     5  3.37  0.283     2  0.2    12.7    2.54    5.91   0.829

How do I write a dplyr pipe-friendly function where a new column name is provided from a function argument?

In this case you can just stick to the embrace operator {{}} for your variables. If you want to create column names dynamically, you will still need to use :=. The difference is that you can combine the glue-style syntax with the embrace operator to get the name of the symbol. This works with the data provided.

elective_open <- function(.data, name_for_elective, course, tiebreaker){ 
  .data%>%
    mutate("{{name_for_elective}}" := ifelse({{tiebreaker}}==max({{tiebreaker}}),1,0)) %>%
    mutate("{{name_for_elective}}" := ifelse({{name_for_elective}}==0,{{course}}[{{name_for_elective}}==1],"")) %>%
    filter(!({{course}} %in% {{name_for_elective}}))
}
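
The naming mechanism can be seen in isolation in a smaller sketch. The helper add_scaled, its arguments, and the use of mtcars are all assumptions for illustration and do not come from the original question.

```r
library(dplyr)

# Hypothetical helper: `new_name` and `source_col` are bare column names.
add_scaled <- function(df, new_name, source_col) {
  df %>%
    # "{{new_name}}" on the left of := becomes the new column's name
    mutate("{{new_name}}" := {{source_col}} * 100)
}

res <- add_scaled(mtcars, wt_scaled, wt)
```

The new column should be named wt_scaled, taken from the bare name passed as new_name.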

Can't use dplyr::arrange() to sort a column in the form of a date in r

Instead of double-quoting the column name, use backticks:

library(dplyr)
values %>% 
   dplyr::arrange(`2022-03-01`)

-output

   2022-03-01
J        0.6
E        2.0
A        2.7
B        3.7
C        5.7
I        6.3
H        6.6
F        9.0
D        9.1
G        9.4

If we want to pass the column name as a string, either wrap it in across():

values %>%
   dplyr::arrange(across("2022-03-01"))

-output

   2022-03-01
J        0.6
E        2.0
A        2.7
B        3.7
C        5.7
I        6.3
H        6.6
F        9.0
D        9.1
G        9.4

Or convert it to a symbol and unquote it with !!:

values %>%
  dplyr::arrange(!! rlang::sym("2022-03-01"))

-output

   2022-03-01
J        0.6
E        2.0
A        2.7
B        3.7
C        5.7
I        6.3
H        6.6
F        9.0
D        9.1
G        9.4

Or with the .data pronoun:

values %>% 
  dplyr::arrange(.data[["2022-03-01"]])
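
The values object is not defined in the answer. The sketch below rebuilds a hypothetical single-column version of it from the printed output and checks that the four approaches sort the same way. all_of() is used inside across() here because the name is held in a variable.

```r
library(dplyr)

# Hypothetical reconstruction of `values` from the printed output above;
# check.names = FALSE keeps the non-syntactic column name intact.
values <- data.frame(
  `2022-03-01` = c(2.7, 3.7, 5.7, 9.1, 2.0, 9.0, 9.4, 6.6, 6.3, 0.6),
  row.names = LETTERS[1:10],
  check.names = FALSE
)

col <- "2022-03-01"

by_backtick <- values %>% arrange(`2022-03-01`)
by_across   <- values %>% arrange(across(all_of(col)))
by_sym      <- values %>% arrange(!!rlang::sym(col))
by_pronoun  <- values %>% arrange(.data[[col]])
```

All four results should hold the column in ascending order, so which one to use is mostly a question of whether the name arrives as a literal or as a string.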
