Combination of expand.grid and mapply

combination of expand.grid and mapply?

I think I have a solution to my own question, but perhaps someone can do better (and I haven't implemented FLATTEN=FALSE ...)

xapply <- function(FUN,...,FLATTEN=TRUE,MoreArgs=NULL) {
  L <- list(...)
  ## one row per combination of element indices
  inds <- do.call(expand.grid,lapply(L,seq_along)) ## Marek's suggestion
  retlist <- list()
  for (i in seq_len(nrow(inds))) {
    ## pick the i-th combination: one element from each input
    arglist <- mapply(function(x,j) x[[j]],L,as.list(inds[i,]),SIMPLIFY=FALSE)
    if (FLATTEN) {
      retlist[[i]] <- do.call(FUN,c(arglist,MoreArgs))
    }
  }
  retlist
}

edit: I tried @baptiste's suggestion, but it's not easy (or wasn't for me). The closest I got was

library(plyr) ## for mlply

xapply2 <- function(FUN,...,FLATTEN=TRUE,MoreArgs=NULL) {
  L <- list(...)
  xx <- do.call(expand.grid,L)
  f <- function(...) {
    do.call(FUN,lapply(list(...),"[[",1))
  }
  mlply(xx,f)
}

which still doesn't work. expand.grid is indeed more flexible than I thought (although it creates a weird data frame that can't be printed), but enough magic is happening inside mlply that I can't quite make it work.

Here is a test case:

L1 <- list(data.frame(x=1:10,y=1:10),
data.frame(x=runif(10),y=runif(10)),
data.frame(x=rnorm(10),y=rnorm(10)))

L2 <- list(y~1,y~x,y~poly(x,2))
z <- xapply(lm,L2,L1)
xapply(lm,L2,L1)
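
For comparison, here is a more compact sketch of the same idea (not from the original post): build the grid of indices once, then hand the selected elements to a single mapply call. The name xapply_sketch is hypothetical, and FLATTEN is left out, as in the original.

```r
# Hypothetical helper: apply FUN to every combination of elements
# drawn from the lists/vectors passed in ...
xapply_sketch <- function(FUN, ..., MoreArgs = NULL) {
  L <- list(...)
  # one column of indices per input, one row per combination
  inds <- expand.grid(lapply(L, seq_along))
  # repeat the elements of each input according to its index column
  picks <- Map(function(x, j) x[j], L, inds)
  do.call(mapply, c(list(FUN = FUN), picks,
                    list(MoreArgs = MoreArgs, SIMPLIFY = FALSE)))
}

# small check with plain vectors: the first argument varies fastest
res <- xapply_sketch(function(a, b) a + b, 1:2, c(10, 20))
unlist(res)
# 11 12 21 22
```

With the test case above, xapply_sketch(lm, L2, L1) should give one fitted model per formula/data pair, in the same order as xapply.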

mapply / expand.grid() for argument combinations with a condition

OK, here's a solution to both problems. Unfortunately, I couldn't get one using mapply so I had to rely on a good old for loop (but it's still faster, given that it doesn't have to do all the extra calculations). Also, I changed the function to give you the names of the variables as you wanted. The biggest difference is that I'm not using expand.grid but merge. Finally, it incorporates your comment from above.

des <- function(var, tmin, tmax, cor.var, cor.method = c("spearman", "pearson", "kendall")){
  # set values outside the [tmin, tmax] quantile range to NA
  var[var < quantile(var, probs = tmin, na.rm = TRUE) |
      var > quantile(var, probs = tmax, na.rm = TRUE)] <- NA
  d <- psych::describe(var)
  correlation <- cor(cor.var, var, use = "pairwise.complete.obs", method = match.arg(cor.method))
  df <- cbind(variable = names(var), tmin = tmin, tmax = tmax, d, correlation)
  names(df)[length(names(df))] <- paste0("correlation_with_", names(cor.var))
  print(df)
}

minmax = data.frame(tmin = c(0.01, 0.05, 0.1, 0.2, 0.25), tmax = c(0.99, 0.95, 0.9, 0.8, 0.75))
args<- merge(c("Var2", "Var4", "Var5"), minmax)
args[,1]<- as.character(args[,1])

alltogether <- NULL
for (i in seq_len(nrow(args))){
  alltogether <- rbind(alltogether, des(var = dataframe[args[i,1]],
                                        tmin = args[i, 2], tmax = args[i, 3], cor.var = dataframe["Var1"]))
}
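
The answer above says it could not get mapply to work. For what it's worth, here is a minimal sketch of how the same row-by-row pattern could look with mapply. It is not a drop-in replacement: trimmed_cor is a simplified, hypothetical stand-in for des (it skips psych::describe), and the data are made up.

```r
set.seed(42)
# hypothetical stand-in for the asker's data
dataframe <- data.frame(Var1 = rnorm(50), Var2 = rnorm(50), Var4 = rnorm(50))
minmax <- data.frame(tmin = c(0.05, 0.1), tmax = c(0.95, 0.9))

# no shared column names, so merge() returns the Cartesian product
args <- merge(data.frame(variable = c("Var2", "Var4"), stringsAsFactors = FALSE),
              minmax)

# simplified stand-in for des(): trim to the quantile range, then correlate with Var1
trimmed_cor <- function(variable, tmin, tmax) {
  v <- dataframe[[variable]]
  lim <- quantile(v, probs = c(tmin, tmax), na.rm = TRUE)
  v[v < lim[1] | v > lim[2]] <- NA
  data.frame(variable = variable, tmin = tmin, tmax = tmax,
             correlation = cor(dataframe$Var1, v,
                               use = "pairwise.complete.obs", method = "spearman"))
}

# one call per row of args; SIMPLIFY = FALSE keeps a list of one-row data frames
rows <- mapply(trimmed_cor, args$variable, args$tmin, args$tmax,
               SIMPLIFY = FALSE, USE.NAMES = FALSE)
alltogether <- do.call(rbind, rows)
alltogether
```

Collecting the pieces in a list and binding once at the end also avoids growing alltogether with rbind inside the loop.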

R: expand grid of all possible combinations within groups and apply functions across all the pairs

Use expand.grid to get all possible combinations of the columns, split the data by time, and apply fun to each row of tmp within each group.

library(dplyr)
library(purrr)

tmp <- expand.grid(firm1 = names(data[-1]), firm2 = names(data[-1]))

fun <- function(x, y) sum(x, y)

result <- data %>%
  group_split(time) %>%
  map_df(~cbind(time = .x$time[1], tmp,
                value = apply(tmp, 1, function(x) fun(.x[[x[1]]], .x[[x[2]]]))))

result

#   time firm1 firm2 value
#1     1     a     a     6
#2     1     b     a    10
#3     1     c     a     5
#4     1     a     b    10
#5     1     b     b    14
#6     1     c     b     9
#7     1     a     c     5
#8     1     b     c     9
#9     1     c     c     4
#10    2     a     a    14
#11    2     b     a    10
#12    2     c     a     9
#13    2     a     b    10
#14    2     b     b     6
#15    2     c     b     5
#16    2     a     c     9
#17    2     b     c     5
#18    2     c     c     4

You can also do this in base R:

result <- do.call(rbind, by(data, data$time, function(x) {
  cbind(time = x$time[1], tmp,
        value = apply(tmp, 1, function(y) fun(x[[y[1]]], x[[y[2]]])))
}))
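
The original data is not shown in the answer. The sketch below uses a small made-up data frame, chosen only so that the group sums line up with the printed output, and runs a base R variant that uses split and mapply instead of by and apply.

```r
# hypothetical data: per-group column sums are 3, 7, 2 (time 1) and 7, 3, 2 (time 2)
data <- data.frame(time = c(1, 1, 2, 2),
                   a = c(1, 2, 3, 4),
                   b = c(3, 4, 1, 2),
                   c = c(1, 1, 1, 1))

firms <- names(data)[-1]
# keep firm names as character so they can index columns directly
tmp <- expand.grid(firm1 = firms, firm2 = firms, stringsAsFactors = FALSE)

fun <- function(x, y) sum(x, y)

result <- do.call(rbind, lapply(split(data, data$time), function(d) {
  # one value per row of tmp, computed inside the current time group
  value <- mapply(function(f1, f2) fun(d[[f1]], d[[f2]]),
                  tmp$firm1, tmp$firm2, USE.NAMES = FALSE)
  cbind(time = d$time[1], tmp, value = value)
}))
rownames(result) <- NULL
result
```

Iterating over tmp$firm1 and tmp$firm2 with mapply avoids the matrix coercion that apply performs on each row.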

mapply for all arguments' combinations [R]

You could combine do.call with expand.grid to handle an arbitrary number of inputs, as follows:

toy <- function(...) do.call(paste, expand.grid(...))

Then, you could do

x = 1:2 ; y = c("#", "$") ; z = c("a", "b")
toy(x, y, z)
# [1] "1 # a" "2 # a" "1 $ a" "2 $ a" "1 # b" "2 # b" "1 $ b" "2 $ b"

This works for any number of input vectors. To check, try something like toy(x, y, z, y, y, y, z).
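
toy works because paste is vectorized over the grid columns. For a function that only handles one combination at a time, a similar sketch can pass the columns to mapply instead. The helper name apply_grid is hypothetical.

```r
# Hypothetical helper: call FUN once per row of the expanded grid
apply_grid <- function(FUN, ...) {
  grid <- expand.grid(..., stringsAsFactors = FALSE)
  # drop the Var1, Var2, ... names so the columns are matched to FUN by position
  cols <- unname(as.list(grid))
  do.call(mapply, c(list(FUN = FUN), cols))
}

res <- apply_grid(function(a, b) a * b, 1:2, 3:4)
res
# 3 6 4 8
```

The unname step matters: without it, mapply would pass the columns as arguments named Var1 and Var2, which most functions would reject.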

Apply a function over pairwise combinations of arguments

Does this work: outer(1:12, 1:9, func)?

If not, use matrix(apply(expand.grid(1:12, 1:9), 1, func), 12, 9).

Background: "dims [product xx] do not match the length of object [xx]" error in using R function `outer`
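
As a sketch of why outer can fail and one common way around it: outer calls the function once with two long vectors, so a function that only handles scalars has to be wrapped, for example with Vectorize. The function func below is a made-up example, not the one from the question.

```r
# hypothetical scalar-only function: `if` needs a single TRUE/FALSE
func <- function(i, j) if (i > j) i - j else 0

# Vectorize() makes func loop over its arguments element by element
res <- outer(1:12, 1:9, Vectorize(func))
dim(res)
# 12 9

# the expand.grid route gives the same matrix
grid <- expand.grid(1:12, 1:9)
res2 <- matrix(mapply(func, grid$Var1, grid$Var2), 12, 9)
all(res == res2)
# TRUE
```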

Most efficient way of getting all combinations of two inputs, possibly with mapply R

This works well: outer evaluates a function over all combinations of a and b, and passing "+" sums each pair.

outer(a,b,"+")
#>      [,1] [,2] [,3]
#> [1,]    3    4    5
#> [2,]    4    5    6
#> [3,]    5    6    7

expand.grid will be slower, but it creates a two-column data.frame that you can work with. You can pass any number of vectors to expand.grid, but the output grows quickly: crossing a million values with a million values yields a trillion rows, which is not feasible. expand.grid is a base R function; tidyr::crossing(a,b) or data.table::CJ could be faster (though not in this toy example).

df = expand.grid(a=a,b=b)
head(df)
#>   a b
#> 1 1 2
#> 2 2 2
#> 3 3 2
#> 4 1 3
#> 5 2 3
#> 6 3 3

From this you can easily extract the sums.

r = df$a+df$b
r
#> [1] 3 4 5 4 5 6 5 6 7
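
The answer never shows how a and b are defined. Assuming a <- 1:3 and b <- 2:4, which is consistent with the printed output, the two approaches can be checked against each other:

```r
# assumed inputs, inferred from the printed output above
a <- 1:3
b <- 2:4

m <- outer(a, b, "+")          # 3 x 3 matrix of sums
df <- expand.grid(a = a, b = b) # 9 rows, a varies fastest
r <- df$a + df$b

# outer() fills its matrix column by column, which matches the row order of expand.grid()
identical(as.vector(m), r)
# TRUE
```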

Map/mapply with all possible combinations of two lists

You can use the following solution. Since we are calling download.file only for its side effect, purrr::walk2 is a better fit here than purrr::map2:

library(purrr)

# First we create a data frame of all combinations of countries and years
comb <- expand.grid(countries, years)

# Then I wrap `download.file` with possibly for error handling
poss_download <- possibly(download.file, otherwise = NA)

# Then I apply our function on every combination of countries and years
# in a row-wise operation

walk2(comb$Var1, comb$Var2, ~ {
  url = paste0("https://www.colef.mx/emif/datasets/basesdeDatos/sur/", .y, "/DEUA", .x, "%20", .y, ".csv")
  destfile = paste0(raw_data, .x, .y,".csv")
  poss_download(url, destfile)
})

Here is a base R solution for this question.

  • Instead of paste0 I used sprintf, which according to its documentation "returns a character vector containing a formatted combination of text and variable values". I used %d for integer/numeric values (twice, for the years) and %s for character strings (once, for the countries). sprintf needs one argument per placeholder, in order, so that each combination yields a single complete string.
  • Then I used tryCatch in place of purrr::possibly to handle possible errors.
  • Finally, I used mapply (Map would also work) to iterate over the url and destfile vectors in parallel.

comb <- expand.grid(countries, years)

url <- sprintf("https://www.colef.mx/emif/datasets/basesdeDatos/sur/%d/DEUA%s%%20%d.csv", comb$Var2, comb$Var1, comb$Var2)

destfile = paste0(raw_data, comb$Var1, comb$Var2,".csv")

mapply(function(x, y) {
  # x is a single url and y its matching destfile
  tryCatch(download.file(x, y),
           error = function(e) {
             NA
           })
}, url, destfile)
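
To try the pattern without touching the network, here is a minimal offline sketch. Everything in it is an assumption made for illustration: the country codes, the years, the example.org URL, and fake_fetch, which stands in for download.file and fails on purpose for one country.

```r
# hypothetical inputs
countries <- c("GUA", "HON")
years <- 2010:2011
comb <- expand.grid(country = countries, year = years, stringsAsFactors = FALSE)

# a literal percent sign in a sprintf() format must be written as %%
url <- sprintf("https://example.org/data/%d/FILE%s%%20%d.csv",
               comb$year, comb$country, comb$year)
destfile <- file.path(tempdir(), paste0(comb$country, comb$year, ".csv"))

# stand-in for download.file(): returns 0 on success, errors for "HON"
fake_fetch <- function(u, dest) {
  if (grepl("HON", u)) stop("not found")
  0L
}

# each call sees one url and its matching destfile; errors become NA
status <- mapply(function(u, d) {
  tryCatch(fake_fetch(u, d), error = function(e) NA_integer_)
}, url, destfile, USE.NAMES = FALSE)
status
```

Swapping fake_fetch for download.file should give the real behaviour, with NA marking the combinations that failed.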

Expand grid of all possible combinations within groups

You could do that with dplyr and expand from tidyr.

df <- read.table(text="dealid acquirer target vendor
1 FirmA FirmB FirmC
1 FirmD NA FirmE
2 FirmA NA FirmC
2 FirmD NA FirmE
2 FirmG FirmF FirmE",header=TRUE,stringsAsFactors=FALSE)

library(dplyr);library(tidyr)
df %>%
  group_by(dealid) %>%
  expand(acquirer, target, vendor)

   dealid acquirer target vendor
    <int> <chr>    <chr>  <chr>
 1      1 FirmA    FirmB  FirmC
 2      1 FirmA    FirmB  FirmE
 3      1 FirmD    FirmB  FirmC
 4      1 FirmD    FirmB  FirmE
 5      2 FirmA    FirmF  FirmC
 6      2 FirmA    FirmF  FirmE
 7      2 FirmD    FirmF  FirmC
 8      2 FirmD    FirmF  FirmE
 9      2 FirmG    FirmF  FirmC
10      2 FirmG    FirmF  FirmE
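
If you prefer to stay in base R, a rough equivalent is to split by dealid and call expand.grid on the unique values in each group. This is only a sketch: it drops NA values explicitly, an assumption chosen to match the printed output, and its row order differs from the tidyr result because expand.grid varies the first column fastest.

```r
df <- data.frame(
  dealid   = c(1, 1, 2, 2, 2),
  acquirer = c("FirmA", "FirmD", "FirmA", "FirmD", "FirmG"),
  target   = c("FirmB", NA, NA, NA, "FirmF"),
  vendor   = c("FirmC", "FirmE", "FirmC", "FirmE", "FirmE"),
  stringsAsFactors = FALSE
)

res <- do.call(rbind, lapply(split(df, df$dealid), function(g) {
  # unique non-missing values per column within this deal
  vals <- lapply(g[c("acquirer", "target", "vendor")],
                 function(v) sort(unique(v[!is.na(v)])))
  cbind(dealid = g$dealid[1], expand.grid(vals, stringsAsFactors = FALSE))
}))
rownames(res) <- NULL
res
```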

Passing multiple arguments to lapply from a dataframe

With apply, the second argument is MARGIN, which specifies whether the function is applied row-wise (1) or column-wise (2). You also need an anonymous function here, since your function takes two separate arguments.

apply(combined, 1, function(x) my_func(x[1], x[2]))
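
A small sketch of that call next to a mapply alternative. Both combined and my_func are made up here, since the question's versions are not shown. Note that apply converts the data frame to a matrix first, so with mixed column types every value would arrive as character; mapply over the columns keeps each column's type.

```r
# hypothetical data frame of argument pairs and a two-argument function
combined <- data.frame(x = 1:3, y = c(10, 20, 30))
my_func <- function(a, b) a + b

# row-wise with apply: each row arrives as one vector
res_apply <- apply(combined, 1, function(x) my_func(x[1], x[2]))

# column-wise pairing with mapply: arguments are passed separately
res_mapply <- mapply(my_func, combined$x, combined$y)

unname(res_apply)
# 11 22 33
res_mapply
# 11 22 33
```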

