Non-redundant version of expand.grid
How about using outer? Note, though, that when called with paste it concatenates each pair into a single character string.
outer( c("aa", "ab", "cc"), c("aa", "ab", "cc") , "paste" )
# [,1] [,2] [,3]
#[1,] "aa aa" "aa ab" "aa cc"
#[2,] "ab aa" "ab ab" "ab cc"
#[3,] "cc aa" "cc ab" "cc cc"
You can also use combn on the unique elements of the two vectors if you don't want pairs that repeat an element (e.g. aa aa):
vals <- c( c("aa", "ab", "cc"), c("aa", "ab", "cc") )
vals <- unique( vals )
combn( vals , 2 )
# [,1] [,2] [,3]
#[1,] "aa" "aa" "ab"
#[2,] "ab" "cc" "cc"
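If a two-column layout like the one expand.grid returns is preferred, the combn result can be transposed. This is only a minimal sketch that reuses the vectors above; the column names Var1 and Var2 are chosen here just to mimic expand.grid.

```r
vals <- unique(c(c("aa", "ab", "cc"), c("aa", "ab", "cc")))

# combn() returns one pair per column; transpose so that each pair is a row
pairs <- as.data.frame(t(combn(vals, 2)), stringsAsFactors = FALSE)
names(pairs) <- c("Var1", "Var2")  # illustrative names, not required
pairs
```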
R - Expand Grid Without Duplicates
In RcppAlgos*, there is a function called comboGrid that does the trick:
library(RcppAlgos) ## as of v2.4.3
comboGrid(X1, X2, X3, repetition = FALSE)
# Var1 Var2 Var3
# [1,] "x" "A" "C"
# [2,] "x" "A" "G"
# [3,] "x" "A" "y"
# [4,] "x" "B" "C"
# [5,] "x" "B" "G"
# [6,] "x" "B" "y"
# [7,] "x" "C" "G"
# [8,] "x" "C" "y"
# [9,] "y" "A" "C"
# [10,] "y" "A" "G"
# [11,] "y" "B" "C"
# [12,] "y" "B" "G"
# [13,] "y" "C" "G"
# [14,] "z" "A" "C"
# [15,] "z" "A" "G"
# [16,] "z" "A" "y"
# [17,] "z" "B" "C"
# [18,] "z" "B" "G"
# [19,] "z" "B" "y"
# [20,] "z" "C" "G"
# [21,] "z" "C" "y"
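For small inputs, roughly the same result can be obtained in base R by filtering expand.grid. The sketch below is not how comboGrid is implemented. It assumes that the intended meaning of repetition = FALSE is "drop rows that repeat an element" and that rows are treated as unordered sets. The inputs X1, X2 and X3 are hypothetical, chosen only to be consistent with the output shown above.

```r
# Hypothetical inputs (the originals are not shown in this answer)
X1 <- c("x", "y", "z")
X2 <- c("A", "B", "C")
X3 <- c("C", "G", "y")

grid <- expand.grid(X1, X2, X3, stringsAsFactors = FALSE)

# 1. Drop rows in which any element occurs more than once
no_rep <- grid[apply(grid, 1, function(r) anyDuplicated(r) == 0), ]

# 2. Drop rows that are a reordering of an earlier row:
#    build an order-independent key from the sorted elements
key <- apply(no_rep, 1, function(r) paste(sort(r), collapse = "\r"))
res <- no_rep[!duplicated(key), ]

nrow(res)
```

Because this materialises the full grid first, it only makes sense when expand.grid itself fits in memory.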
Large Test
set.seed(42)
rnd_lst <- lapply(1:11, function(x) {
  sort(sample(LETTERS, sample(26, 1)))
})
## Number of results that expand.grid would return if your machine
## had enough memory... over 365 billion!!!
prettyNum(prod(lengths(rnd_lst)), big.mark = ",")
# [1] "365,634,846,720"
exp_grd_test <- expand.grid(rnd_lst)
# Error: vector memory exhausted (limit reached?)
system.time(cmb_grd_test <- comboGrid(rnd_lst, repetition=FALSE))
# user system elapsed
# 9.866 0.330 10.196
dim(cmb_grd_test)
# [1] 3036012 11
head(cmb_grd_test)
# Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11
# [1,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "K"
# [2,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "L"
# [3,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "M"
# [4,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "N"
# [5,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "O"
# [6,] "A" "E" "C" "B" "D" "G" "F" "H" "J" "I" "P"
* I am the author of RcppAlgos
Use outer instead of expand.grid
Using rep.int:
expand.grid.alt <- function(seq1, seq2) {
  cbind(rep.int(seq1, length(seq2)),
        c(t(matrix(rep.int(seq2, length(seq1)), nrow = length(seq2)))))
}
expand.grid.alt(seq_len(nrow(dat)), seq_len(ncol(dat)))
On my computer this is about 6 times faster than expand.grid.
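A quick way to convince yourself that the rep.int version produces the same rows, in the same order, as expand.grid is to compare the two on a small input. The sequences 1:3 and 1:2 below are arbitrary stand-ins for seq_len(nrow(dat)) and seq_len(ncol(dat)), since dat is not shown here.

```r
expand.grid.alt <- function(seq1, seq2) {
  cbind(rep.int(seq1, length(seq2)),
        c(t(matrix(rep.int(seq2, length(seq1)), nrow = length(seq2)))))
}

seq1 <- 1:3  # stand-in for seq_len(nrow(dat))
seq2 <- 1:2  # stand-in for seq_len(ncol(dat))

alt <- expand.grid.alt(seq1, seq2)
ref <- as.matrix(expand.grid(seq1, seq2))

# Same values in the same positions; only the column names differ
same <- all(alt == ref)
same
```

Note that the result is a plain matrix without column names, not a data frame, which is part of why it is cheaper to build.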
How to generate an output satisfied with specific conditions from expand.grid in R
A slight modification to Vincent Zoonekynd's answer will take care of non-numerical factors:
a <- c(1,2,3,"X","Y","M")
eg <- expand.grid(a,a)
eg2 <- eg[as.character(eg$Var1) < as.character(eg$Var2), ]
Basically, you need to compare the values as strings: expand.grid returns factor columns here, and a "plain" comparison with < does not work on factor variables.
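An alternative, sketched here as an option rather than as the original answer's method, is to stop expand.grid from creating factors in the first place via its stringsAsFactors argument. The columns are then character vectors and can be compared directly.

```r
a <- c(1, 2, 3, "X", "Y", "M")  # coerced to character by c()

# Keep the columns as character instead of factor
eg <- expand.grid(a, a, stringsAsFactors = FALSE)

# Keep each unordered pair once and drop the self-pairs
eg2 <- eg[eg$Var1 < eg$Var2, ]
nrow(eg2)  # choose(6, 2) = 15
```

The exact ordering within each pair depends on the locale's collation rules, but the number of pairs does not.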
How to get only unique combinations of variables where entries can be in either variable
The combn function will give you all n-combinations of elements from a vector; however, it does not pair elements with themselves. You can add those pairs on fairly easily, so you can get the combinations you want with
cbind(combn(j,2), rbind(j,j))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# j "a" "a" "a" "b" "b" "c" "a" "b" "c" "d"
# j "b" "c" "d" "c" "d" "d" "a" "b" "c" "d"
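For downstream use it is often handier to have one pair per row. The sketch below assumes j is c("a", "b", "c", "d"), which is inferred from the output above rather than stated in the answer; the column names first and second are illustrative.

```r
j <- c("a", "b", "c", "d")  # assumed input, inferred from the output above

# Distinct pairs from combn() plus each element paired with itself
pairs <- cbind(combn(j, 2), rbind(j, j))

# One pair per row instead of one pair per column
res <- data.frame(first = pairs[1, ], second = pairs[2, ],
                  stringsAsFactors = FALSE)
nrow(res)  # choose(4, 2) + 4 = 10
```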
combinations of combinations in R
So I think I may have cracked it. I've pillaged a couple of answers to other questions. One of them has a function called expand.grid.unique, which removes duplicates if you put the same vector into expand.grid twice. Another has one called expand.grid.df, which I'm not even going to pretend to understand, and which extends expand.grid to work on data frames. Combined, they do what I want them to do.
upVariables<-c("up1", "up2", "up3", "up4", "up5")
downVariables<-c("down1", "down2", "down3", "down4", "down5")
ratioGroups<-data.frame(matrix(ncol=2, nrow=0))
colnames(ratioGroups)<-c("mix1","mix2")
ups<-expand.grid.unique(upVariables,upVariables)
downs<-expand.grid.unique(downVariables,downVariables)
comboList<-expand.grid.df(ups,downs)
comboList <- data.frame(lapply(comboList, as.character), stringsAsFactors=FALSE)
colnames(comboList)<-c("u1","u2","d1","d2")
There's a bunch of faffing about in there converting everything back to strings because everything gets converted to factors for some reason.
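Since expand.grid.unique and expand.grid.df are not reproduced here, the following is a rough base-R stand-in for the same two steps, not those functions themselves. It assumes that unique unordered pairs without self-pairs are what is wanted, and it uses the hypothetical helper unique_pairs together with merge(..., by = NULL), which returns the Cartesian product of two data frames.

```r
upVariables   <- c("up1", "up2", "up3", "up4", "up5")
downVariables <- c("down1", "down2", "down3", "down4", "down5")

# Hypothetical helper: every unordered pair of distinct elements, one per row
unique_pairs <- function(x, col_names) {
  out <- data.frame(t(combn(unique(x), 2)), stringsAsFactors = FALSE)
  names(out) <- col_names
  out
}

ups   <- unique_pairs(upVariables,   c("u1", "u2"))
downs <- unique_pairs(downVariables, c("d1", "d2"))

# by = NULL makes merge() return every row of ups paired with every row of downs
combo_list <- merge(ups, downs, by = NULL)
dim(combo_list)  # 10 * 10 rows, 4 columns
```

Because the data frames are built with stringsAsFactors = FALSE, the columns stay character and no conversion back from factors is needed.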
If I put Jota's answer into a function:
getGroups <- function(line) {
  n <- 2  # the number of ratios being used
  combos <- expand.grid(as.character(line[1:2]), as.character(line[3:4]))
  combos <- combos[with(combos, order(Var1)), ]  # use dplyr::arrange if you prefer
  mat <- matrix(1:n^2, byrow = TRUE, nrow = n)
  for (j in 2:nrow(mat)) mat[j, ] <- mat[j, c(j:ncol(mat), 1:(j - 1))]
  pairs <- split(combos[c(mat), ], rep(1:n, each = n))
  collapsed <- sapply(lapply(pairs, apply, 1, paste, collapse = '_'), paste, collapse = '-')
}
I can then use
ratiosGroups<-as.vector(apply(comboList,1,getGroups))
to return a list of all possible combinations. I'm guessing this still isn't the best way to achieve my larger goal, but it's getting there.
All Possible Pairs between Two Vectors in R Without Replacement
This looks like a permutation problem, which can be solved as follows:
> library(pracma)
> paste0(v1, t(perms(v2)))
[1] "AZ" "BY" "CX" "AZ" "BX" "CY" "AY" "BZ" "CX" "AY" "BX" "CZ" "AX" "BY" "CZ"
[16] "AX" "BZ" "CY"
or
out <- data.frame(
Var1 = v1,
Var2 = c(t(perms(v2))),
Match = ceiling(seq(factorial(length(v2)) * length(v2)) / length(v1))
)
which gives
> out
Var1 Var2 Match
1 A Z 1
2 B Y 1
3 C X 1
4 A Z 2
5 B X 2
6 C Y 2
7 A Y 3
8 B Z 3
9 C X 3
10 A Y 4
11 B X 4
12 C Z 4
13 A X 5
14 B Y 5
15 C Z 5
16 A X 6
17 B Z 6
18 C Y 6
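If adding pracma as a dependency is not desirable, the permutations can also be generated with a small recursive helper in base R. This is a sketch only: perms_base is a hypothetical name, v1 and v2 are assumed to be c("A", "B", "C") and c("X", "Y", "Z") (inferred from the output above), and the order of the matches may differ from what pracma::perms returns.

```r
v1 <- c("A", "B", "C")  # assumed, inferred from the output above
v2 <- c("X", "Y", "Z")  # assumed, inferred from the output above

# Hypothetical helper: all permutations of v, one per row
perms_base <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) {
    # fix v[i] in the first position and permute the rest
    cbind(v[i], perms_base(v[-i]))
  }))
}

p <- perms_base(v2)

# t(p) is read column by column, so v1 is recycled over one permutation at a time
matches <- paste0(v1, t(p))
length(matches)  # factorial(3) * 3 = 18
```

The number of permutations grows factorially, so this is only practical for short vectors.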
How to expand.grid on vectors sets rather than single elements
We could try
do.call(cbind,lapply(expand.grid(list(a1, a2), list(b1,b2)),
function(x) matrix(do.call(c, x), 4, 2, byrow=TRUE)))
# [,1] [,2] [,3] [,4]
#[1,] 11 12 31 32
#[2,] 21 22 31 32
#[3,] 11 12 41 42
#[4,] 21 22 41 42
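A variation that may be easier to read is to run expand.grid over the positions of the vectors rather than over the vectors themselves, and then bind the selected vectors together. The values of a1, a2, b1 and b2 below are assumptions inferred from the output above.

```r
# Assumed inputs, inferred from the output above
a <- list(a1 = c(11, 12), a2 = c(21, 22))
b <- list(b1 = c(31, 32), b2 = c(41, 42))

# Every combination of "which vector from a" with "which vector from b"
idx <- expand.grid(i = seq_along(a), j = seq_along(b))

# Stack the chosen vectors as rows, then put the two halves side by side
res <- cbind(do.call(rbind, a[idx$i]), do.call(rbind, b[idx$j]))
unname(res)
```

This form does not hard-code the matrix dimensions, so it also works when the lists hold more than two vectors each.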