Generating a Very Large Matrix of String Combinations Using combn() and the bigmemory Package

You could first find all 2-way combinations and then combine each of them with the third value, writing the results to disk at every step. This takes far less memory:

combn.mod <- function(x, fname) {
  # Start from all 2-way combinations, kept as a list
  tmp <- combn(x, 2, simplify = FALSE)
  n <- length(x)
  for (i in x[-c(n, n - 1)]) {
    # Drop all combinations that contain value i
    id <- which(!unlist(lapply(tmp, function(p) i %in% p)))
    tmp <- tmp[id]
    # Add i to all other combinations and write to file
    out <- do.call(rbind, lapply(tmp, c, i))
    write(t(out), file = fname, ncolumns = 3, append = TRUE, sep = ",")
  }
}

combn.mod(x,"F:/Tmp/Test.txt")

This is not as general as Joshua's answer; it is written specifically for your case. I guess it is faster, again for this particular case, but I didn't make the comparison. The function runs on my computer using a little over 50 MB (roughly estimated) when applied to your x.

EDIT

On a side note: if this is for simulation purposes, I find it hard to believe that any scientific application needs 400+ million simulation runs. You might be looking for the right answer to the wrong question here...

PROOF OF CONCEPT:

I replaced the write line with tt[[i]] <- out, added tt <- list() before the loop and return(tt) after it. Then:

> do.call(rbind, combn.mod(letters[1:5]))
      [,1] [,2] [,3]
 [1,] "b"  "c"  "a"
 [2,] "b"  "d"  "a"
 [3,] "b"  "e"  "a"
 [4,] "c"  "d"  "a"
 [5,] "c"  "e"  "a"
 [6,] "d"  "e"  "a"
 [7,] "c"  "d"  "b"
 [8,] "c"  "e"  "b"
 [9,] "d"  "e"  "b"
[10,] "d"  "e"  "c"
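The in-memory variant described above can be sketched as follows. This is only an illustration: the name combn.mod.mem is hypothetical (the answer reuses the name combn.mod), and the final check against combn() is an addition, not part of the original answer.

```r
# Hypothetical in-memory variant: collect each block in a list instead of writing it
combn.mod.mem <- function(x) {
  tmp <- combn(x, 2, simplify = FALSE)
  n <- length(x)
  tt <- list()
  for (i in x[-c(n, n - 1)]) {
    # Keep only the pairs that do not contain i, then append i to each of them
    keep <- !vapply(tmp, function(p) i %in% p, logical(1))
    tmp <- tmp[keep]
    tt[[i]] <- do.call(rbind, lapply(tmp, c, i))
  }
  tt
}

res <- do.call(rbind, combn.mod.mem(letters[1:5]))

# Compare with combn(): same combinations once the order within rows is ignored
got <- apply(res, 1, function(r) paste(sort(r), collapse = ""))
ref <- as.vector(combn(letters[1:5], 3, FUN = paste, collapse = ""))
nrow(res) == choose(5, 3)   # TRUE
setequal(got, ref)          # TRUE
```

Each combination appears exactly once because a value is removed from the pool of pairs before it is appended as the third element.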

Why doesn't combn() with paste0 as FUN give me the expected result (R)

Try this command and see whether it works; it worked for me.

combn(LETTERS[1:5],3, FUN=paste0, collapse = "")
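The reason collapse matters is that combn() passes each combination to FUN as a single character vector. A short sketch of the difference, using the same input as above:

```r
x <- LETTERS[1:5]

# Without collapse, paste0() is vectorised over the three elements and
# returns them unchanged, so the result is a 3 x 10 matrix
m <- combn(x, 3, FUN = paste0)
dim(m)      # 3 10

# With collapse = "", each combination is joined into one string
s <- combn(x, 3, FUN = paste0, collapse = "")
s[1:3]      # "ABC" "ABD" "ABE"
```

Extra arguments such as collapse are forwarded to FUN through the ... argument of combn().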

Same output from combinations, comboGeneral and combn by converting matrix to list

For that you may use

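library(purrr); library(arrangements); library(RcppAlgos)  # assumed sources of map(), combinations(), comboGeneral()
comb2 <- map(1:2, ~combinations(names(mtcars), k = .x) %>% split(row(.))) %>% unlist(recursive = FALSE)
comb3 <- map(1:2, ~comboGeneral(names(mtcars), m = .x, FUN = c)) %>% unlist(recursive = FALSE)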

comb2 happens to be a named list; if that's an issue, you may add extra %>% unname.
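For comparison, the combn() form that the other two are presumably being matched to can be sketched in base R. The name comb1 is an assumption; it does not appear in the answer.

```r
# Base-R sketch: all 1- and 2-element combinations of the column names,
# returned as one flat, unnamed list of character vectors
comb1 <- unlist(
  lapply(1:2, function(k) combn(names(mtcars), k, simplify = FALSE)),
  recursive = FALSE
)

length(comb1)   # 66, i.e. choose(11, 1) + choose(11, 2)
comb1[[12]]     # "mpg" "cyl", the first 2-element combination
```

With simplify = FALSE, combn() already returns a list, so no matrix-to-list conversion is needed on that side.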

Creating a matrix with all combinations within a budget

These combinatorial objects are called partitions, and their computation is implemented by the partitions package.

Depending on what you really want, use one of the following:

library(partitions)

## The first argument says you want to enumerate all partitions in which the
## second argument (5) is broken into three summands, each of which can take a
## maximum value of 5.
blockparts(rep(5,3),5) ## Equiv: blockparts(c(5,5,5), 5)
#
# [1,] 5 4 3 2 1 0 4 3 2 1 0 3 2 1 0 2 1 0 1 0 0
# [2,] 0 1 2 3 4 5 0 1 2 3 4 0 1 2 3 0 1 2 0 1 0
# [3,] 0 0 0 0 0 0 1 1 1 1 1 2 2 2 2 3 3 3 4 4 5

restrictedparts(5,3)
#
# [1,] 5 4 3 3 2
# [2,] 0 1 2 1 2
# [3,] 0 0 0 1 1
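To make the difference between the two calls concrete, here is a brute-force base-R sketch that reproduces the same counts without the partitions package. It only illustrates what is being enumerated and is far less efficient than the package functions; the names g, comps and parts are made up for this example.

```r
# All triples with entries 0..5, one triple per row
g <- expand.grid(0:5, 0:5, 0:5)

# Ordered case (what blockparts() enumerates above): order of summands matters
comps <- g[rowSums(g) == 5, ]
nrow(comps)   # 21

# Unordered case (what restrictedparts() enumerates above):
# keep one representative per multiset, written in non-increasing order
parts <- comps[comps$Var1 >= comps$Var2 & comps$Var2 >= comps$Var3, ]
nrow(parts)   # 5
```

The 21 and 5 match the number of columns in the two outputs shown above.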

Equivalent of row() and col() for big.matrix in R

So, with Rcpp, you can do:

// [[Rcpp::depends(BH, bigmemory)]]
#include <bigmemory/MatrixAccessor.hpp>
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void fillBM(SEXP pBigMat) {

  XPtr<BigMatrix> xpMat(pBigMat);
  MatrixAccessor<double> macc(*xpMat);

  int n = macc.nrow();
  int m = macc.ncol();

  // Fill the lower triangle (including the diagonal), column by column
  for (int j = 0; j < m; j++) {
    for (int i = j; i < n; i++) {
      macc[j][i] = pow(i - j, 5) + 2;
    }
  }
}

/*** R
library(bigmemory)
k <- big.matrix(nrow = 8000, ncol = 8000, type = 'double', init = 0)
k.mat <- k[]

system.time(
  fillBM(k@address)
)
k[1:5, 1:5]

system.time(
  k.mat <- ifelse(row(k.mat) < col(k.mat), 0, (row(k.mat) - col(k.mat))^5 + 2)
)
k.mat[1:5, 1:5]
all.equal(k.mat, k[])
*/

The Rcpp function takes about 2 seconds, while the R version (on a standard R matrix) takes about 10 seconds and uses much more memory.
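To see on a small scale what the loop computes, the same lower-triangle fill can be sketched on an ordinary R matrix. This is a 5 x 5 illustration only, not a replacement for the big.matrix code, and the names m, idx and ref are made up for this example.

```r
n <- 5
m <- matrix(0, n, n)

# Index pairs of the lower triangle (row >= col), matching the C++ loop bounds
idx <- which(row(m) >= col(m), arr.ind = TRUE)
m[idx] <- (idx[, "row"] - idx[, "col"])^5 + 2

# Reference: the ifelse() formulation used in the R chunk above
ref <- ifelse(row(m) < col(m), 0, (row(m) - col(m))^5 + 2)
all.equal(m, ref)   # TRUE
m[3, 1]             # 34, i.e. (3 - 1)^5 + 2
```

The difference row - col is the same whether indices are 1-based (R) or 0-based (C++), which is why the C++ loop can use i - j directly.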


