Using anonymous functions with summarize_each or mutate_each
It's a matter of using a lot of parentheses so everything gets evaluated:
df_foo %>%
summarize_each(funs(((function(bar){sum(bar/10)})(.))))
#
# Source: local data frame [1 x 2]
#
# x y
# (dbl) (dbl)
# 1 1.113599 -0.4766853
where you need
- parentheses around the function definition so it gets defined,
- a set of parentheses with a . to tell funs which parameter to stick the data passed to it in (seemingly redundant with single-parameter functions, but not so with multi-parameter ones; see ?funs for more examples), and
- parentheses around the whole thing to actually evaluate it,
which is kind of ridiculous, but that seems to be the most concise form funs can handle. It makes some sense if you look at what you'd have to write to evaluate a similar anonymous function on its own line, e.g.
(function(bar){sum(bar/10)})(df_foo$x)
though the pair wrapping the whole thing is extra for funs. You can use braces {} instead for the outer pair if you prefer, which might make more syntactic sense.
How to use anonymous functions for mutate_each (and summarise_each)?
We can wrap the function call with parentheses
df %>%
mutate_each(funs(((function(x){x/2})(.))))
Using an anonymous function in mutate
Apparently what you need is a whole bunch of parentheses. See https://stackoverflow.com/a/36906989/3277050
In your situation it looks like:
files.split.df <- files.paths.df %>%
mutate(
no.ext = (function(x) {sub(paste0(".", x["extension"], "$"), "", x["file"])})(.)
)
So it seems that if you wrap the whole function definition in parentheses, you can treat it like a regular function and supply arguments to it.
New Answer
Really, though, this is not the right way to use mutate at all. I focused on the anonymous function part first without looking at what you are actually doing. What you need is a vectorized version of sub, so I used str_replace from the stringr package. Then you can just refer to columns by name, because that is the beauty of dplyr:
library(tidyr)
library(dplyr)
library(stringr)
files.split.df <- files.paths.df %>%
  mutate(
    # "\\." matches a literal dot; a bare "." would match any character
    no.ext = str_replace(file, paste0("\\.", extension, "$"), ""))
Edit to Answer Comment
To use a user-defined function where there isn't an existing vectorized function, you could use Vectorize like this:
# x is the extension, y is the file name; "\\." matches a literal dot
string_fun <- Vectorize(function(x, y) {sub(paste0("\\.", x, "$"), "", y)})
files.split.df <- files.paths.df %>%
  mutate(
    no.ext = string_fun(extension, file))
Or if you really don't want to name the function, which I do not recommend as it is much harder to read:
files.split.df <- files.paths.df %>%
  mutate(
    no.ext = (Vectorize(function(x, y) {sub(paste0("\\.", x, "$"), "", y)}))(extension, file))
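As a standalone illustration of what Vectorize does here, the following base R sketch strips a per-element extension from a vector of file names. The names strip_ext, files and exts are hypothetical, and the sample values are invented:

```r
# sub() expects a single pattern, so wrap it with Vectorize() to apply one
# pattern per element. USE.NAMES = FALSE keeps the result unnamed.
strip_ext <- Vectorize(
  function(ext, file) sub(paste0("\\.", ext, "$"), "", file),
  USE.NAMES = FALSE
)

# Hypothetical sample data.
files <- c("report.txt", "data.csv")
exts <- c("txt", "csv")

no_ext <- strip_ext(exts, files)
print(no_ext)
```

Vectorize is a thin wrapper around mapply, so this is convenient rather than fast; where a natively vectorized function such as str_replace exists, it is usually the better choice.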
Update a subset of a df with mutate_each
As @alistaire commented, you can use mutate_at to convert only the date columns and leave the rest of the data frame unchanged, which avoids having to bind the original data frame back together with the subsets:
library(dplyr)
muX <- x %>% mutate_at(vars(contains('date')), funs(as.Date(., origin="1900-01-01")))
head(muX)
# date1 date2 var1 var2
# 1 2021-11-09 2038-10-20 44.524710 86.15957
# 2 2020-06-04 2037-08-04 31.402905 94.74633
# 3 2023-12-22 2038-03-06 31.600929 85.90605
# 4 2020-05-08 2037-01-02 7.140777 82.80565
# 5 2025-03-25 2038-07-30 -54.913577 100.83949
# 6 2021-02-18 2034-06-20 28.616670 93.92246
Also, according to ?mutate_at:
summarise_each() and mutate_each() are older variants that will be
deprecated in the future.
Better get used to these new APIs.
How to change the now deprecated dplyr::funs() which includes an ifelse argument?
As of dplyr 0.8.0, the documentation states that we should use list instead of funs, giving the example:
Before:
funs(name = f(.))
After:
list(name = ~f(.))
So here, the call funs(ifelse(is.character(.), trimws(.),.)) can instead become list(~ifelse(is.character(.), trimws(.),.)). This uses the tidyverse formula notation for anonymous functions, where a one-sided formula (an expression beginning with ~) is interpreted as function(x), and the place where x would go in the function is represented by a dot (.). You can still use full functions inside list.
Note the difference between the .funs argument of mutate_if and the funs() function, which wrapped other functions to pass to .funs; i.e. .funs = gsub still works. You only needed funs() if you wanted to apply multiple functions to the selected columns, or to name the results by passing the functions as named arguments. You can do all the same things with list().
You are also duplicating work by adding ifelse inside mutate_if; that line could be simplified to mutate_if(is.character, trimws), since mutate_if has already selected the character columns and you don't need to check them again with ifelse. Since you apply only one function, there is no need for funs or list at all.
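A small sketch of the two equivalent spellings, assuming dplyr 0.8.0 or later is installed. The data frame df and its columns are invented for illustration:

```r
library(dplyr)

# Hypothetical data: one character column with stray whitespace, one numeric.
df <- data.frame(
  name = c(" a ", "b  "),
  n = c(1, 2),
  stringsAsFactors = FALSE
)

# Formula notation inside list(), the replacement for funs().
with_list <- df %>% mutate_if(is.character, list(~trimws(.)))

# Passing the function directly, with no wrapper at all.
with_bare <- df %>% mutate_if(is.character, trimws)

print(with_list$name)
print(with_bare$name)
```

Both calls trim only the character column and leave n untouched, so the bare function is the simpler choice when a single function is applied.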