Non-Redundant Version of Expand.Grid

How about using outer? Note that with paste as the function, each pair is concatenated into a single character string.

outer( c("aa", "ab", "cc"), c("aa", "ab", "cc") , "paste" )
# [,1] [,2] [,3]
#[1,] "aa aa" "aa ab" "aa cc"
#[2,] "ab aa" "ab ab" "ab cc"
#[3,] "cc aa" "cc ab" "cc cc"
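If only the non-redundant pairs are wanted from the outer result, one option is to keep just the cells above the diagonal. A minimal sketch, assuming both arguments are the same vector (named x here purely for illustration):

```r
x <- c("aa", "ab", "cc")  # illustrative name, same values as above
m <- outer(x, x, paste)

# upper.tri() selects the cells above the diagonal, so each unordered
# pair appears once and self-pairs such as "aa aa" are dropped
pairs <- m[upper.tri(m)]
pairs
```

This still builds the full matrix first, so it only suits small inputs.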

You can also use combn on the unique elements of the two vectors if you don't want the repeating elements (e.g. aa aa):

vals <- c( c("aa", "ab", "cc"), c("aa", "ab", "cc") )
vals <- unique( vals )
combn( vals , 2 )
# [,1] [,2] [,3]
#[1,] "aa" "aa" "ab"
#[2,] "ab" "cc" "cc"
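If pasted strings or a one-row-per-pair layout are preferred, combn can apply a function to each combination through its FUN argument, and the matrix can be transposed. A small sketch using the same values (the names pasted and pair_rows are illustrative):

```r
vals <- unique(c(c("aa", "ab", "cc"), c("aa", "ab", "cc")))

# FUN is applied to each combination; collapse is passed on to paste
pasted <- combn(vals, 2, FUN = paste, collapse = " ")

# one row per pair instead of one column per pair
pair_rows <- t(combn(vals, 2))
```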

R - Expand Grid Without Duplicates

In RcppAlgos*, there is a function called comboGrid that does the trick:

library(RcppAlgos) ## as of v2.4.3
comboGrid(X1, X2, X3, repetition = FALSE)
# Var1 Var2 Var3
# [1,] "x" "A" "C"
# [2,] "x" "A" "G"
# [3,] "x" "A" "y"
# [4,] "x" "B" "C"
# [5,] "x" "B" "G"
# [6,] "x" "B" "y"
# [7,] "x" "C" "G"
# [8,] "x" "C" "y"
# [9,] "y" "A" "C"
# [10,] "y" "A" "G"
# [11,] "y" "B" "C"
# [12,] "y" "B" "G"
# [13,] "y" "C" "G"
# [14,] "z" "A" "C"
# [15,] "z" "A" "G"
# [16,] "z" "A" "y"
# [17,] "z" "B" "C"
# [18,] "z" "B" "G"
# [19,] "z" "B" "y"
# [20,] "z" "C" "G"
# [21,] "z" "C" "y"
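For small inputs, a comparable result can be sketched in base R by filtering expand.grid. This is only an illustration and not how comboGrid works internally; the inputs X1, X2 and X3 below are assumptions inferred from the values in the output above.

```r
# Assumed inputs, inferred from the values shown in the output above
X1 <- c("x", "y", "z")
X2 <- c("A", "B", "C")
X3 <- c("C", "G", "y")

eg <- expand.grid(X1, X2, X3, stringsAsFactors = FALSE)

# Drop rows in which any value appears more than once
no_rep <- eg[apply(eg, 1, function(r) !anyDuplicated(r)), ]

# Treat rows as unordered sets: build a key from the sorted values
# and keep the first row for each key
key <- apply(no_rep, 1, function(r) paste(sort(r), collapse = "|"))
res <- no_rep[!duplicated(key), ]
nrow(res)
```

Because this materialises the full grid before filtering, it does not scale to inputs like the large test in this answer.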

Large Test

set.seed(42)
rnd_lst <- lapply(1:11, function(x) {
    sort(sample(LETTERS, sample(26, 1)))
})

## Number of results that expand.grid would return if your machine
## had enough memory... over 365 billion!!!
prettyNum(prod(lengths(rnd_lst)), big.mark = ",")
# [1] "365,634,846,720"

exp_grd_test <- expand.grid(rnd_lst)
# Error: vector memory exhausted (limit reached?)

system.time(cmb_grd_test <- comboGrid(rnd_lst, repetition=FALSE))
# user system elapsed
# 9.866 0.330 10.196

dim(cmb_grd_test)
# [1] 3036012 11

head(cmb_grd_test)
# Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11
# [1,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "K"
# [2,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "L"
# [3,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "M"
# [4,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "N"
# [5,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "O"
# [6,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "P"

* I am the author of RcppAlgos

Use outer instead of expand.grid

Using rep.int:

expand.grid.alt <- function(seq1, seq2) {
    cbind(rep.int(seq1, length(seq2)),
          c(t(matrix(rep.int(seq2, length(seq1)), nrow = length(seq2)))))
}

expand.grid.alt(seq_len(nrow(dat)), seq_len(ncol(dat)))

On my computer this is about 6 times faster than expand.grid.
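A quick way to check that the function matches expand.grid is to compare the two on small inputs. The sketch below repeats the definition so that it runs on its own; s1 and s2 are made-up inputs.

```r
expand.grid.alt <- function(seq1, seq2) {
  cbind(rep.int(seq1, length(seq2)),
        c(t(matrix(rep.int(seq2, length(seq1)), nrow = length(seq2)))))
}

s1 <- 1:3  # made-up inputs for the comparison
s2 <- 1:2

alt <- expand.grid.alt(s1, s2)
ref <- as.matrix(expand.grid(s1, s2))

# Same values in the same order: first column varies fastest
same <- all(alt == ref)

# The second column is equivalent to repeating each element of s2
second_col_ok <- identical(alt[, 2], rep(s2, each = length(s1)))
```

Note that the result is a plain matrix without column names, whereas expand.grid returns a data frame.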

How to generate an output satisfied with specific conditions from expand.grid in R

A slight modification to Vincent Zoonekynd's answer will take care of non-numerical factors:

a <- c(1,2,3,"X","Y","M")
eg <- expand.grid(a,a)
eg2 <- eg[as.character(eg$Var1) < as.character(eg$Var2), ]

Basically, you need a string comparison instead of a "plain" comparison: expand.grid turns the character vector into factor columns by default, and < is not meaningful for unordered factors.
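An alternative sketch, assuming you are free to change the expand.grid call: passing stringsAsFactors = FALSE keeps the columns as character vectors, so the comparison works without the as.character conversions.

```r
a <- c(1, 2, 3, "X", "Y", "M")  # c() coerces everything to character

# Keep the columns as character instead of factor
eg <- expand.grid(a, a, stringsAsFactors = FALSE)

# Plain string comparison keeps one ordering of each unordered pair
eg2 <- eg[eg$Var1 < eg$Var2, ]
nrow(eg2)
```

The order used by < on strings depends on the locale, but since all values are distinct, each unordered pair is kept exactly once either way.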

How to get only unique combinations of variables where entries can be in either variable

The combn function gives you all n-combinations of the elements of a vector, but it does not pair elements with themselves. You can add those pairs on fairly easily, so the combinations you want are given by

cbind(combn(j,2), rbind(j,j))

# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# j "a" "a" "a" "b" "b" "c" "a" "b" "c" "d"
# j "b" "c" "d" "c" "d" "d" "a" "b" "c" "d"
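The same idea can be wrapped in a small function that returns one pair per row. This is only a sketch: pairs_with_self is a hypothetical name, and the input j is assumed to be the four letters seen in the output above.

```r
j <- c("a", "b", "c", "d")  # assumed input, matching the output above

# Hypothetical helper: unordered pairs including self-pairs, one per row
pairs_with_self <- function(x) {
  rbind(t(combn(x, 2)),                 # pairs of distinct elements
        cbind(x, x, deparse.level = 0)) # each element paired with itself
}

res <- pairs_with_self(j)
nrow(res)
```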

combinations of combinations in R

So I think I may have cracked it. I've pillaged a couple of answers to other questions. One provides a function called expand.grid.unique, which removes duplicates if you put the same vector into expand.grid twice. Another provides one called expand.grid.df, which I'm not even going to pretend to understand, and which extends expand.grid to work on data frames. Combined, however, they do what I want them to do.

upVariables<-c("up1", "up2", "up3", "up4", "up5")
downVariables<-c("down1", "down2", "down3", "down4", "down5")
ratioGroups<-data.frame(matrix(ncol=2, nrow=0))
colnames(ratioGroups)<-c("mix1","mix2")

ups<-expand.grid.unique(upVariables,upVariables)
downs<-expand.grid.unique(downVariables,downVariables)
comboList<-expand.grid.df(ups,downs)
comboList <- data.frame(lapply(comboList, as.character), stringsAsFactors=FALSE)
colnames(comboList)<-c("u1","u2","d1","d2")

There's a bunch of faffing about in there converting everything back to strings, because everything gets converted to factors along the way (most likely a stringsAsFactors = TRUE default somewhere in the helpers).
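The two helpers are not reproduced here. As a rough base R stand-in, assuming expand.grid.unique yields unordered pairs without self-pairs and expand.grid.df yields a cross join of two data frames, something like the following gives the same shape of result while keeping everything as character:

```r
upVariables <- c("up1", "up2", "up3", "up4", "up5")
downVariables <- c("down1", "down2", "down3", "down4", "down5")

# Assumed stand-in for expand.grid.unique: unordered pairs, no self-pairs
ups <- as.data.frame(t(combn(upVariables, 2)), stringsAsFactors = FALSE)
downs <- as.data.frame(t(combn(downVariables, 2)), stringsAsFactors = FALSE)
names(ups) <- c("u1", "u2")
names(downs) <- c("d1", "d2")

# Assumed stand-in for expand.grid.df: merge() with by = NULL
# returns the Cartesian product of the rows of the two data frames
comboList <- merge(ups, downs, by = NULL)
dim(comboList)
```

The row order may differ from what the original helpers produce.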

If I put Jota's answer into a function:

getGroups <- function(line) {
    n <- 2  # the number of ratios being used
    combos <- expand.grid(as.character(line[1:2]), as.character(line[3:4]))
    combos <- combos[with(combos, order(Var1)), ]  # use dplyr::arrange if you prefer
    mat <- matrix(1:n^2, byrow = TRUE, nrow = n)
    # rotate row j of the index matrix left by j - 1 positions
    for (j in 2:nrow(mat)) mat[j, ] <- mat[j, c(j:ncol(mat), 1:(j - 1))]
    pairs <- split(combos[c(mat), ], rep(1:n, each = n))
    # paste each pair with "_" and join the pairs in a group with "-"
    collapsed <- sapply(lapply(pairs, apply, 1, paste, collapse = "_"), paste, collapse = "-")
    collapsed
}

I can then use

ratiosGroups<-as.vector(apply(comboList,1,getGroups))

to return a vector of all possible combinations. I'm guessing this still isn't the best way to achieve my larger goal, but it's getting there.

All Possible Pairs between Two Vectors in R Without Replacement

This looks like a permutation problem, which could be solved as follows:

> library(pracma)

> paste0(v1, t(perms(v2)))
[1] "AZ" "BY" "CX" "AZ" "BX" "CY" "AY" "BZ" "CX" "AY" "BX" "CZ" "AX" "BY" "CZ"
[16] "AX" "BZ" "CY"

or

out <- data.frame(
    Var1 = v1,
    Var2 = c(t(perms(v2))),
    Match = ceiling(seq(factorial(length(v2)) * length(v2)) / length(v1))
)

which gives

> out
Var1 Var2 Match
1 A Z 1
2 B Y 1
3 C X 1
4 A Z 2
5 B X 2
6 C Y 2
7 A Y 3
8 B Z 3
9 C X 3
10 A Y 4
11 B X 4
12 C Z 4
13 A X 5
14 B Y 5
15 C Z 5
16 A X 6
17 B Z 6
18 C Y 6
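If pracma is not available, the permutations can be generated with a short recursive function in base R. The following is a sketch only: all_perms is a hypothetical helper, v1 and v2 are assumed from the output above, and the matchings may come out in a different order than with perms.

```r
v1 <- c("A", "B", "C")  # assumed inputs, matching the output above
v2 <- c("X", "Y", "Z")

# Hypothetical helper: all permutations of x, one per row
all_perms <- function(x) {
  if (length(x) <= 1) return(matrix(x, nrow = 1))
  do.call(rbind, lapply(seq_along(x), function(i) {
    # fix x[i] in front and permute the remaining elements
    cbind(x[i], all_perms(x[-i]))
  }))
}

p <- all_perms(v2)

# Each column is one complete matching of v1 with a permutation of v2
matchings <- apply(p, 1, function(row) paste0(v1, row))
dim(matchings)
```

The number of permutations grows factorially, so this is only practical for short vectors.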

How to expand.grid on vectors sets rather than single elements

We could try

do.call(cbind,lapply(expand.grid(list(a1, a2), list(b1,b2)),
function(x) matrix(do.call(c, x), 4, 2, byrow=TRUE)))
# [,1] [,2] [,3] [,4]
#[1,] 11 12 31 32
#[2,] 21 22 31 32
#[3,] 11 12 41 42
#[4,] 21 22 41 42
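A perhaps more readable variant is to run expand.grid over row indices and then look the vectors up. This is a sketch under the assumption that a1, a2, b1 and b2 hold the values seen in the output above; A, B and idx are illustrative names.

```r
# Assumed inputs, inferred from the output above
a1 <- c(11, 12); a2 <- c(21, 22)
b1 <- c(31, 32); b2 <- c(41, 42)

A <- rbind(a1, a2)  # one vector per row
B <- rbind(b1, b2)

# Expand over row indices rather than over the vectors themselves
idx <- expand.grid(i = seq_len(nrow(A)), j = seq_len(nrow(B)))

# Look up the rows and bind them side by side
res <- unname(cbind(A[idx$i, ], B[idx$j, ]))
res
```

This extends to more sets by adding further index columns to the expand.grid call.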

