combination of expand.grid and mapply?
I think I have a solution to my own question, but perhaps someone can do better (and I haven't implemented FLATTEN=FALSE ...)
xapply <- function(FUN,...,FLATTEN=TRUE,MoreArgs=NULL) {
  L <- list(...)
  ## one row per combination of indices into the input lists (Marek's suggestion)
  inds <- do.call(expand.grid,lapply(L,seq_along))
  retlist <- list()
  for (i in seq_len(nrow(inds))) {
    ## pick the i-th combination of elements, one from each input list
    arglist <- mapply(function(x,j) x[[j]],L,as.list(inds[i,]),SIMPLIFY=FALSE)
    if (FLATTEN) {  ## FLATTEN=FALSE is not implemented
      retlist[[i]] <- do.call(FUN,c(arglist,MoreArgs))
    }
  }
  retlist
}
edit: I tried @baptiste's suggestion, but it's not easy (or wasn't for me). The closest I got was
xapply2 <- function(FUN,...,FLATTEN=TRUE,MoreArgs=NULL) {
L <- list(...)
xx <- do.call(expand.grid,L)
f <- function(...) {
do.call(FUN,lapply(list(...),"[[",1))
}
mlply(xx,f)
}
which still doesn't work. expand.grid is indeed more flexible than I thought (although it creates a weird data frame that can't be printed), but enough magic is happening inside mlply that I can't quite make it work.
Here is a test case:
L1 <- list(data.frame(x=1:10,y=1:10),
data.frame(x=runif(10),y=runif(10)),
data.frame(x=rnorm(10),y=rnorm(10)))
L2 <- list(y~1,y~x,y~poly(x,2))
z <- xapply(lm,L2,L1)
xapply(lm,L2,L1)
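For comparison, here is a loop-free sketch of the same idea in base R. The name cross_apply and the toy inputs are made up for illustration; it assumes every input is a list (or vector) that can be indexed with [[, and it does not implement FLATTEN either.

```r
# Hypothetical helper: call FUN once for every combination of elements
# taken from the lists passed in `...`.
cross_apply <- function(FUN, ..., MoreArgs = NULL) {
  L <- list(...)
  # Grid of indices, one column per input list; the first input varies fastest
  inds <- expand.grid(lapply(L, seq_along))
  lapply(seq_len(nrow(inds)), function(i) {
    # Element inds[i, k] of the k-th input list
    args <- Map(function(x, j) x[[j]], L, inds[i, ])
    do.call(FUN, c(args, MoreArgs))
  })
}

# Toy inputs: 2 x 3 = 6 combinations
res <- cross_apply(function(a, b) paste(a, b), list("x", "y"), list(1, 2, 3))
length(res)
```

Working on indices rather than on the values themselves avoids asking expand.grid to build a data frame out of data frames or formulas.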
mapply / expand.grid () for argument's combination with a condition
OK, here's a solution to both problems. Unfortunately, I couldn't get one using mapply, so I had to rely on a good old for loop (but it's still faster, given that it doesn't have to do all the extra calculations). Also, I changed the function to give you the names of the variables as you wanted. The biggest difference is that I'm not using expand.grid but merge. Finally, it incorporates your comment from above.
des <- function(var, tmin, tmax, cor.var, cor.method = c("spearman", "pearson", "kendall")){
var[var < quantile(var, probs = tmin, na.rm = TRUE) |
var > quantile(var, probs = tmax, na.rm = TRUE)] <- NA
d <- psych::describe(var)
correlation<- cor(cor.var, var, use="pairwise.complete", match.arg(cor.method))
df <- cbind(variable = names(var), tmin = tmin, tmax = tmax, d, correlation)
names(df)[length(names(df))]<- paste0("correlation_with_", names(cor.var))
print(df)
}
minmax = data.frame(tmin = c(0.01, 0.05, 0.1, 0.2, 0.25), tmax = c(0.99, 0.95, 0.9, 0.8, 0.75))
args<- merge(c("Var2", "Var4", "Var5"), minmax)
args[,1]<- as.character(args[,1])
alltogether<- NULL
for (i in 1:nrow(args)){
alltogether<- rbind(alltogether, des(var = dataframe[args[i,1]],
tmin = args[i, 2], tmax=args[i, 3], cor.var = dataframe["Var1"]))
}
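The reason merge can stand in for expand.grid here is that merging two data frames with no column names in common yields their Cartesian product, while each (tmin, tmax) pair stays together. A minimal sketch with a shortened, made-up minmax table:

```r
# Variable names as a one-column data frame (the column name is an assumption)
vars <- data.frame(variable = c("Var2", "Var4", "Var5"), stringsAsFactors = FALSE)

# Paired trimming limits; each row must stay together
minmax <- data.frame(tmin = c(0.01, 0.05), tmax = c(0.99, 0.95))

# No shared column names, so merge() returns every row of vars
# combined with every row of minmax
args <- merge(vars, minmax)
nrow(args)   # 3 variables x 2 limit pairs
```

With expand.grid(variable, tmin, tmax) the limits would be crossed with each other as well, which is not wanted when tmin and tmax belong together.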
R: expand grid of all possible combinations within groups and apply functions across all the pairs
Use expand.grid to get all possible combinations of columns, split the data by time, and apply fun for each row of tmp.
library(dplyr)
library(purrr)
tmp <- expand.grid(firm1 = names(data[-1]), firm2 = names(data[-1]))
fun <- function(x, y) sum(x, y)
result <- data %>%
group_split(time) %>%
map_df(~cbind(time = .x$time[1], tmp,
value = apply(tmp, 1, function(x) fun(.x[[x[1]]], .x[[x[2]]]))))
result
# time firm1 firm2 value
#1 1 a a 6
#2 1 b a 10
#3 1 c a 5
#4 1 a b 10
#5 1 b b 14
#6 1 c b 9
#7 1 a c 5
#8 1 b c 9
#9 1 c c 4
#10 2 a a 14
#11 2 b a 10
#12 2 c a 9
#13 2 a b 10
#14 2 b b 6
#15 2 c b 5
#16 2 a c 9
#17 2 b c 5
#18 2 c c 4
You may also do this in base R -
result <- do.call(rbind, by(data, data$time, function(x) {
cbind(time = x$time[1], tmp,
value = apply(tmp, 1, function(y) fun(x[[y[1]]], x[[y[2]]])))
}))
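Since data is not shown in the answer, here is a self-contained sketch of the base R variant. The data frame below is invented for illustration (chosen so that its per-group sums agree with the printed output above), and split/lapply is used in place of by.

```r
# Hypothetical input: one row per observation, one column per firm
data <- data.frame(time = c(1, 1, 2, 2),
                   a = c(1, 2, 3, 4),
                   b = c(3, 4, 1, 2),
                   c = c(1, 1, 1, 1))

# All ordered pairs of firm columns
tmp <- expand.grid(firm1 = names(data[-1]), firm2 = names(data[-1]))
fun <- function(x, y) sum(x, y)

# For each time group, evaluate fun on every pair of columns
pieces <- lapply(split(data, data$time), function(x) {
  cbind(time = x$time[1], tmp,
        value = apply(tmp, 1, function(y) fun(x[[y[1]]], x[[y[2]]])))
})
result <- do.call(rbind, pieces)
nrow(result)   # 9 pairs x 2 time groups
```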
mapply for all arguments' combinations [R]
You could combine do.call with expand.grid to work with any number of inputs, as follows:
toy <- function(...) do.call(paste, expand.grid(...))
Then, you could do
x = 1:2 ; y = c("#", "$") ; z = c("a", "b")
toy(x, y, z)
# [1] "1 # a" "2 # a" "1 $ a" "2 $ a" "1 # b" "2 # b" "1 $ b" "2 $ b"
This will work for any input. You could try something like toy(x, y, z, y, y, y, z), for instance, to check this.
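The same pattern extends from paste to an arbitrary function by handing the grid columns to mapply. This is only a sketch; apply_grid is a hypothetical name, and it assumes FUN takes its arguments positionally in the order the inputs are given.

```r
# Hypothetical helper: evaluate FUN on every combination of the inputs
apply_grid <- function(FUN, ...) {
  # Keep character inputs as character rather than converting them to factors
  grid <- expand.grid(..., stringsAsFactors = FALSE)
  # Pass each grid column as one positional argument to mapply
  do.call(mapply, c(list(FUN = FUN), unname(as.list(grid))))
}

# Repeat each string 1 and 2 times; the first input varies fastest
res <- apply_grid(function(n, s) strrep(s, n), 1:2, c("a", "b"))
res
```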
Apply a function over pairwise combinations of arguments
Does this work: outer(1:12, 1:9, func)?
If not, use matrix(apply(expand.grid(1:12, 1:9), 1, func), 12, 9).
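outer usually fails when func is not vectorized, because it calls func once with two long vectors instead of once per pair. A small sketch of two workarounds, assuming a made-up two-argument func and smaller dimensions than in the question:

```r
# Hypothetical scalar function: `if` only handles a single comparison
func <- function(x, y) if (x > y) x - y else 0

# Option 1: wrap func so that outer() can call it on whole vectors
res_outer <- outer(1:4, 1:3, Vectorize(func))

# Option 2: build the grid explicitly and call func once per row
g <- expand.grid(x = 1:4, y = 1:3)
res_grid <- matrix(mapply(func, g$x, g$y), nrow = 4, ncol = 3)

dim(res_outer)
```

Note that apply(..., 1, func) passes each row as a single vector, so that form only works when func accepts one length-2 argument; mapply passes the two values separately.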
Background: "dims [product xx] do not match the length of object [xx]" error in using R function `outer`
Most efficient way of getting all combinations of two inputs, possibly with mapply R
This works well. outer takes all the combinations and you can specify to sum them.
outer(a,b,"+")
#> [,1] [,2] [,3]
#> [1,] 3 4 5
#> [2,] 4 5 6
#> [3,] 5 6 7
expand.grid will be slower but will create a two-column data.frame that you can work on. Note that with expand.grid you can pass in any number of vectors, but the output will grow quickly. So this won't work if you have a million things to cross with a million things (resulting in a trillion things). Also worth noting that this is a base function, but you could also try tidyr::crossing(a,b) or data.table::CJ, which could be faster (though not in this toy example).
df = expand.grid(a=a,b=b)
head(df)
#> a b
#> 1 1 2
#> 2 2 2
#> 3 3 2
#> 4 1 3
#> 5 2 3
#> 6 3 3
From this you can easily extract the sums.
r = df$a+df$b
r
#> [1] 3 4 5 4 5 6 5 6 7
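The two approaches give the same numbers in the same order, since both let the first vector vary fastest. A quick check, assuming a and b are the vectors implied by the printed output (they are not defined in the answer):

```r
# Assumed inputs, inferred from the output shown above
a <- 1:3
b <- 2:4

# Matrix of all pairwise sums
m <- outer(a, b, "+")

# Same sums computed from the long-format grid
df <- expand.grid(a = a, b = b)
r <- df$a + df$b

# Flattening the matrix column by column reproduces the grid order
all(as.vector(m) == r)
```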
Map/mapply with all possible combinations of two lists
You can use the following solution. Since we are calling download.file only for its side effect, purrr::walk2 is a better option than purrr::map2:
library(purrr)
# First we create a data frame of all combinations of countries and years
comb <- expand.grid(countries, years)
# Then I wrap `download.file` with possibly for error handling
poss_download <- possibly(download.file, otherwise = NA)
# Then I apply our function on every combination of countries and years
# in a row-wise operation
walk2(comb$Var1, comb$Var2, ~ {
url = paste0("https://www.colef.mx/emif/datasets/basesdeDatos/sur/", .y, "/DEUA", .x, "%20", .y, ".csv")
destfile = paste0(raw_data, .x, .y,".csv")
poss_download(url, destfile)
})
Here is a base R solution for this question.
- Instead of paste0 I used the sprintf function, which according to the documentation "returns a character vector containing a formatted combination of text and variable values". I used %d for integer/numeric values (twice, for the years) and %s for character strings (once, for the countries). Note that we have to provide one variable per placeholder, in order, so that each value is substituted in its place in the resulting string.
- Then I used tryCatch in place of purrr::possibly to handle possible errors.
- In the end I used mapply or Map to iterate over both vectors, url and destfile, at the same time.
comb <- expand.grid(countries, years)
url <- sprintf("https://www.colef.mx/emif/datasets/basesdeDatos/sur/%d/DEUA%s%d.csv", comb$Var2, comb$Var1, comb$Var2)
destfile = paste0(raw_data, comb$Var1, comb$Var2,".csv")
mapply(function(x, y) {
  # x is a single URL and y the matching destination file
  tryCatch(download.file(x, y),
           error = function(e) {
             NA
           })
}, url, destfile)
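The URL building and the error handling can be tried out without touching the network. In the sketch below the country codes, years, the example.com address and the fetch function are all placeholders; fetch merely stands in for download.file. One detail worth knowing is that a literal percent sign in a sprintf format has to be written as %%.

```r
# Placeholder inputs
countries <- c("MX", "GT")
years <- c(2018L, 2019L)

comb <- expand.grid(country = countries, year = years, stringsAsFactors = FALSE)

# %d takes the years, %s the countries, %%20 produces a literal "%20"
urls <- sprintf("https://example.com/%d/DATA%s%%20%d.csv",
                comb$year, comb$country, comb$year)

# Stand-in for download.file: fails for some inputs
fetch <- function(u) if (grepl("GT", u)) stop("not found") else nchar(u)

# One call per URL; an error becomes NA instead of stopping the loop
status <- vapply(urls,
                 function(u) tryCatch(fetch(u), error = function(e) NA_integer_),
                 integer(1), USE.NAMES = FALSE)
is.na(status)
```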
Expand grid of all possible combinations within groups
You could do that with dplyr and expand from tidyr.
df <- read.table(text="dealid acquirer target vendor
1 FirmA FirmB FirmC
1 FirmD NA FirmE
2 FirmA NA FirmC
2 FirmD NA FirmE
2 FirmG FirmF FirmE",header=TRUE,stringsAsFactors=FALSE)
library(dplyr);library(tidyr)
df%>%
group_by(dealid)%>%
expand(acquirer, target, vendor)
dealid acquirer target vendor
<int> <chr> <chr> <chr>
1 1 FirmA FirmB FirmC
2 1 FirmA FirmB FirmE
3 1 FirmD FirmB FirmC
4 1 FirmD FirmB FirmE
5 2 FirmA FirmF FirmC
6 2 FirmA FirmF FirmE
7 2 FirmD FirmF FirmC
8 2 FirmD FirmF FirmE
9 2 FirmG FirmF FirmC
10 2 FirmG FirmF FirmE
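If you would rather avoid the extra packages, a rough base R sketch of the same idea is to split by dealid and run expand.grid on the distinct non-missing values in each group. It assumes that NA should be dropped, as in the output above, and the rows come out in a different order because expand.grid varies the first column fastest.

```r
df <- data.frame(dealid   = c(1, 1, 2, 2, 2),
                 acquirer = c("FirmA", "FirmD", "FirmA", "FirmD", "FirmG"),
                 target   = c("FirmB", NA, NA, NA, "FirmF"),
                 vendor   = c("FirmC", "FirmE", "FirmC", "FirmE", "FirmE"),
                 stringsAsFactors = FALSE)

pieces <- lapply(split(df, df$dealid), function(g) {
  # Distinct non-missing values of each role within this deal
  vals <- lapply(g[c("acquirer", "target", "vendor")],
                 function(v) unique(v[!is.na(v)]))
  # All combinations for this deal, labelled with its id
  cbind(dealid = g$dealid[1], expand.grid(vals, stringsAsFactors = FALSE))
})
out <- do.call(rbind, pieces)
nrow(out)
```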
Passing multiple arguments to lapply from a dataframe
With apply, the second argument is the MARGIN, which specifies whether the function is applied row-wise (1) or column-wise (2). You would also need to use an anonymous function here, since your function accepts two separate arguments.
apply(combined, 1, function(x) my_func(x[1], x[2]))
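Keep in mind that apply first converts the data frame to a matrix, so mixed column types all become character. When each column should keep its own type, mapply over the columns is an alternative. A small sketch, with combined and my_func made up for illustration:

```r
# Hypothetical inputs
combined <- expand.grid(x = 1:3, y = c(10, 20))
my_func <- function(a, b) a * b

# Row-wise with apply: each row arrives as one vector
res_apply <- apply(combined, 1, function(r) my_func(r[1], r[2]))

# Column-wise pairing with mapply: each argument arrives separately
res_mapply <- mapply(my_func, combined$x, combined$y)

all.equal(unname(res_apply), res_mapply)
```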