Function to Split a Matrix into Sub-Matrices in R

If you have a 16x12 matrix like this:

mb <- structure(c("a1", "a2", "a3", "a4", "e1", "e2", "e3", "e4", "i1", 
"i2", "i3", "i4", "m1", "m2", "m3", "m4", "a5", "a6", "a7", "a8",
"e5", "e6", "e7", "e8", "i5", "i6", "i7", "i8", "m5", "m6", "m7",
"m8", "a9", "a10", "a11", "a12", "e9", "e10", "e11", "e12", "i9",
"i10", "i11", "i12", "m9", "m10", "m11", "m12", "b1", "b2", "b3",
"b4", "f1", "f2", "f3", "f4", "j1", "j2", "j3", "j4", "n1", "n2",
"n3", "n4", "b5", "b6", "b7", "b8", "f5", "f6", "f7", "f8", "j5",
"j6", "j7", "j8", "n5", "n6", "n7", "n8", "b9", "b10", "b11",
"b12", "f9", "f10", "f11", "f12", "j9", "j10", "j11", "j12",
"n9", "n10", "n11", "n12", "c1", "c2", "c3", "c4", "g1", "g2",
"g3", "g4", "k1", "k2", "k3", "k4", "o1", "o2", "o3", "o4", "c5",
"c6", "c7", "c8", "g5", "g6", "g7", "g8", "k5", "k6", "k7", "k8",
"o5", "o6", "o7", "o8", "c9", "c10", "c11", "c12", "g9", "g10",
"g11", "g12", "k9", "k10", "k11", "k12", "o9", "o10", "o11",
"o12", "d1", "d2", "d3", "d4", "h1", "h2", "h3", "h4", "l1",
"l2", "l3", "l4", "p1", "p2", "p3", "p4", "d5", "d6", "d7", "d8",
"h5", "h6", "h7", "h8", "l5", "l6", "l7", "l8", "p5", "p6", "p7",
"p8", "d9", "d10", "d11", "d12", "h9", "h10", "h11", "h12", "l9",
"l10", "l11", "l12", "p9", "p10", "p11", "p12"), .Dim = c(16L,
12L))

You can define matsplitter as

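matsplitter <- function(M, r, c) {
  rg  <- (row(M) - 1) %/% r + 1     # row-block index of every cell
  cg  <- (col(M) - 1) %/% c + 1     # column-block index of every cell
  rci <- (rg - 1) * max(cg) + cg    # single block id, numbered row by row
  N   <- prod(dim(M)) / r / c       # number of r x c blocks
  cv  <- unlist(lapply(1:N, function(x) M[rci == x]))
  dim(cv) <- c(r, c, N)             # stack the blocks as slices of a 3D array
  cv
}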

Then

matsplitter(mb,4,3)

will return (output clipped)

, , 1

     [,1] [,2] [,3]
[1,] "a1" "a5" "a9"
[2,] "a2" "a6" "a10"
[3,] "a3" "a7" "a11"
[4,] "a4" "a8" "a12"

, , 2

     [,1] [,2] [,3]
[1,] "b1" "b5" "b9"
[2,] "b2" "b6" "b10"
[3,] "b3" "b7" "b11"
[4,] "b4" "b8" "b12"

, , 3

     [,1] [,2] [,3]
[1,] "c1" "c5" "c9"
[2,] "c2" "c6" "c10"
[3,] "c3" "c7" "c11"
[4,] "c4" "c8" "c12"

...
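
As a quick sanity check, the sketch below runs the same function on a small numeric matrix and compares a few slices with direct indexing. The 4x6 matrix M and the 2x3 block size are made up for illustration; the point is only to show that blocks are numbered row by row across the grid of blocks.

```r
# Same function as above, repeated so the example is self-contained.
matsplitter <- function(M, r, c) {
  rg  <- (row(M) - 1) %/% r + 1
  cg  <- (col(M) - 1) %/% c + 1
  rci <- (rg - 1) * max(cg) + cg
  N   <- prod(dim(M)) / r / c
  cv  <- unlist(lapply(1:N, function(x) M[rci == x]))
  dim(cv) <- c(r, c, N)
  cv
}

M <- matrix(1:24, nrow = 4)        # hypothetical 4x6 test matrix
blocks <- matsplitter(M, 2, 3)     # four 2x3 blocks

dim(blocks)                             # rows, columns, number of blocks
identical(blocks[, , 1], M[1:2, 1:3])   # top-left block
identical(blocks[, , 2], M[1:2, 4:6])   # top-right block
identical(blocks[, , 3], M[3:4, 1:3])   # bottom-left block
```

The function assumes that r divides nrow(M) and c divides ncol(M); otherwise the final dim<- assignment fails because the element count no longer matches.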

R: Convert matrix to list of submatrices

Use Map to iterate over the first and last row index of each chunk (here m is the matrix from the question and p is the number of rows per chunk):

p = 3
Map(function(u,v) m[u:v,], seq(1,nrow(m),p), seq(p,nrow(m),p))

#[[1]]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]   14    8    5   10    9
#[2,]   10    4    5    7    8
#[3,]    3    3    6    7    3

#[[2]]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    4    8   12    1    1
#[2,]    4    2   13    1   11
#[3,]    6    2    4    1   12

#[[3]]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]   11   12    8    5    7
#[2,]    3    6    2    6    2
#[3,]   13   13   10    7   12

#[[4]]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    9    7   12    8    9
#[2,]   10    8   13   14   13
#[3,]   12    6   11    4   11
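
The matrix m is not defined in the answer, so here is a self-contained sketch of the same Map idea. The 12x5 matrix, the seed, and the names starts, ends and chunks are assumptions made for illustration. It also clamps the last end index with pmin, because the two seq() calls above produce vectors of different lengths when nrow(m) is not a multiple of p.

```r
set.seed(1)                                               # arbitrary seed
m <- matrix(sample(1:14, 60, replace = TRUE), nrow = 12)  # made-up 12x5 data
p <- 3                                                    # rows per chunk

starts <- seq(1, nrow(m), by = p)          # first row of each chunk
ends   <- pmin(starts + p - 1, nrow(m))    # last row, clamped to the matrix

# drop = FALSE keeps every chunk a matrix, even a one-row chunk
chunks <- Map(function(u, v) m[u:v, , drop = FALSE], starts, ends)

length(chunks)         # number of chunks
sapply(chunks, nrow)   # rows in each chunk
```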

Split matrix into submatrices

Perhaps a strategy like this could work

submat <- function(m, nrow, ncol) {
  stopifnot(nrow(m) >= nrow, ncol(m) >= ncol)
  rowstarts <- 1:(nrow(m) - nrow + 1)    # possible top rows
  colstarts <- 1:(ncol(m) - ncol + 1)    # possible left columns
  ss <- function(r, c) {
    m[r:(r + nrow - 1), c:(c + ncol - 1), drop = FALSE]
  }
  with(expand.grid(r = rowstarts, c = colstarts), mapply(ss, r, c, SIMPLIFY = FALSE))
}

submat(M, 4, 4)

We first determine the possible starting indices for the rows and columns, then use expand.grid() to generate every combination of those starting values, and finally use mapply to extract the submatrix at each starting position. Note that this returns every contiguous (overlapping) submatrix of the requested size, not a partition into disjoint blocks.
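
Since M is not defined in the answer, here is a small worked example under assumed data (a made-up 5x6 matrix). It illustrates that the number of submatrices returned is (nrow(M) - 4 + 1) * (ncol(M) - 4 + 1), and that the row start varies fastest because expand.grid varies its first column fastest.

```r
# Same function as above, repeated so the example is self-contained.
submat <- function(m, nrow, ncol) {
  stopifnot(nrow(m) >= nrow, ncol(m) >= ncol)
  rowstarts <- 1:(nrow(m) - nrow + 1)
  colstarts <- 1:(ncol(m) - ncol + 1)
  ss <- function(r, c) {
    m[r:(r + nrow - 1), c:(c + ncol - 1), drop = FALSE]
  }
  with(expand.grid(r = rowstarts, c = colstarts), mapply(ss, r, c, SIMPLIFY = FALSE))
}

M <- matrix(1:30, nrow = 5)    # hypothetical 5x6 matrix
subs <- submat(M, 4, 4)

length(subs)                             # 2 row starts * 3 column starts
identical(subs[[1]], M[1:4, 1:4])        # window starting at row 1, column 1
identical(subs[[2]], M[2:5, 1:4])        # next window: row start moves first
```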

Split a matrix into submatrices by rownames

We can split the sequence of rows by the row names and then subset the rows of the matrix using the index.

lapply(split(1:nrow(m), rownames(m)), function(i) m[i,]) 
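
A minimal sketch of the same idea on made-up data (the matrix m, its row names and the name by_name are assumptions for illustration). It adds drop = FALSE, which is not in the line above, so that a group with a single row stays a matrix instead of collapsing to a vector.

```r
# Hypothetical 6x2 matrix with repeated row names
m <- matrix(1:12, nrow = 6,
            dimnames = list(c("a", "b", "a", "c", "b", "a"), c("x", "y")))

by_name <- lapply(split(seq_len(nrow(m)), rownames(m)),
                  function(i) m[i, , drop = FALSE])

names(by_name)   # one list element per distinct row name
by_name$a        # the three rows named "a"
by_name$c        # a single row, still a 1x2 matrix
```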

Splitting a matrix row based

One option is to create a logical index with rep, use it to split the sequence of row indices of the matrix, and then subset the matrix with each index vector in the resulting list:

out <- lapply(split(seq_len(nrow(m1)),
                    rep(rep(c(TRUE, FALSE), c(8, 2)), length.out = nrow(m1))),
              function(i) m1[i, ])

Also, as @user20650 mentioned in the comments, split.data.frame can be used on matrices as well. From the documentation (?split):

The data frame method can also be used to split a matrix into a list of matrices, and the replacement form likewise, provided they are invoked explicitly.

out1 <- split.data.frame(m1,
                         rep(rep(c(TRUE, FALSE), c(8, 2)), length.out = nrow(m1)))

data

set.seed(24)
m1 <- matrix(rnorm(100 * 1024), nrow = 100, ncol = 1024)
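
To make the grouping pattern easier to see, here is a sketch on a much smaller matrix (the 20x3 matrix and the name keep are made up for illustration). The index is 8 TRUE values followed by 2 FALSE values, repeated down the rows, so the result is a list with one element named "FALSE" and one named "TRUE".

```r
m1 <- matrix(seq_len(60), nrow = 20)    # hypothetical 20x3 matrix

# 8 x TRUE, 2 x FALSE, recycled to the number of rows
keep <- rep(rep(c(TRUE, FALSE), c(8, 2)), length.out = nrow(m1))

out <- split.data.frame(m1, keep)       # called explicitly on a matrix

names(out)            # group labels
dim(out[["TRUE"]])    # rows 1-8 and 11-18
dim(out[["FALSE"]])   # rows 9-10 and 19-20
```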

How can I separate a matrix into smaller ones in R?

If you have a matrix A, this returns the first two columns of every row whose third column equals 1:

A[A[,3] == 1,c(1,2)]

You can use this to obtain matrices for any value in the third column.

Explanation: A[,3] == 1 returns a vector of booleans, where the i-th position is TRUE if A[i,3] is 1. This vector of booleans can be used to index into a matrix to extract the rows we want.

Disclaimer: I have very little experience with R, this is the MATLAB-ish way to do it.
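
To get one matrix per distinct value of the third column, the same logical indexing can be wrapped in lapply. This is only a sketch: the matrix A and the name parts are made up for illustration, and drop = FALSE is added so that a value matching a single row still yields a matrix.

```r
# Hypothetical 5x3 matrix; the third column holds the group label
A <- cbind(c(10, 20, 30, 40, 50),
           c(1.5, 2.5, 3.5, 4.5, 5.5),
           c(1, 2, 1, 3, 2))

vals  <- sort(unique(A[, 3]))
parts <- lapply(vals, function(v) A[A[, 3] == v, c(1, 2), drop = FALSE])
names(parts) <- vals       # labels become "1", "2", "3"

parts[["1"]]   # rows 1 and 3 of A, first two columns
parts[["3"]]   # a single row, kept as a 1x2 matrix
```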

Split matrices in R

Create a grouping vector and split on it (because of as.data.frame, this returns a list of data frames rather than matrices):

n <- 10
grp <- (seq_len(nrow(X)) - 1) %/% n + 1
split(as.data.frame(X), grp)

Or use the starting index of each chunk to subset the rows (this assumes nrow(X) is a multiple of n):

lapply(seq(1, nrow(X), by = n), function(i) X[i:(i + n - 1), ])

data

X <- matrix(1:40, ncol = 2)
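
When the number of rows is not a multiple of n, indexing X[i:(i + n - 1), ] runs past the last row and fails with a subscript error. One possible workaround, sketched here on a made-up 23-row matrix, is to split the row indices by the same grouping vector and subset the matrix with them, which also keeps each piece a matrix.

```r
X <- matrix(1:46, ncol = 2)    # hypothetical matrix with 23 rows
n <- 10                        # rows per chunk

grp <- (seq_len(nrow(X)) - 1) %/% n + 1    # 1 for rows 1-10, 2 for 11-20, ...

chunks <- lapply(split(seq_len(nrow(X)), grp),
                 function(i) X[i, , drop = FALSE])

sapply(chunks, nrow)    # the last chunk holds the remaining rows
```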

Partition matrix into N equally-sized chunks with R

Here's an attempt in base R. Calculate "pretty" cut values for the sequence of rows using pretty. Categorize the sequence of row numbers with cut, and return a list of the sequence split at the cut values with split. Finally, run through the list of split row values using lapply and extract the matrix subsets with [.

lapply(split(seq_len(nrow(data)),
             cut(seq_len(nrow(data)), pretty(seq_len(nrow(data)), number_of_chunks))),
       function(x) data[x, ])

$`(0,2]`
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    9   17   25   33   41   49   57   65    73
[2,]    2   10   18   26   34   42   50   58   66    74

$`(2,4]`
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    3   11   19   27   35   43   51   59   67    75
[2,]    4   12   20   28   36   44   52   60   68    76

$`(4,6]`
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    5   13   21   29   37   45   53   61   69    77
[2,]    6   14   22   30   38   46   54   62   70    78

$`(6,8]`
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    7   15   23   31   39   47   55   63   71    79
[2,]    8   16   24   32   40   48   56   64   72    80

Roll this into a function:

array_split <- function(data, number_of_chunks) {
  rowIdx <- seq_len(nrow(data))
  lapply(split(rowIdx, cut(rowIdx, pretty(rowIdx, number_of_chunks))),
         function(x) data[x, ])
}

Then, you can use

array_split(data=data, number_of_chunks=number_of_chunks)

to return the same result as above.


A nice simplification suggested by @user20650 is

split.data.frame(data,
cut(seq_len(nrow(data)), pretty(seq_len(nrow(data)), number_of_chunks)))

To my surprise, split.data.frame returns a list of matrices when its first argument is a matrix.
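
One caveat: pretty is documented as returning "about" the requested number of intervals, so the approach above may not always produce exactly number_of_chunks pieces. If an exact count matters, a possible alternative is to compute the group index arithmetically. The sketch below is an assumption-laden illustration, not part of the original answer: the helper name array_split_even and the 10x3 test matrix are made up, and it assumes number_of_chunks is no larger than nrow(data).

```r
# Hypothetical helper: exactly number_of_chunks groups whose sizes differ by at most one
array_split_even <- function(data, number_of_chunks) {
  n   <- nrow(data)
  grp <- ceiling(seq_len(n) * number_of_chunks / n)    # group id in 1..number_of_chunks
  lapply(split(seq_len(n), grp), function(i) data[i, , drop = FALSE])
}

data <- matrix(1:30, nrow = 10)    # made-up 10x3 matrix
chunks <- array_split_even(data, 4)

length(chunks)          # number of chunks
sapply(chunks, nrow)    # rows per chunk
```

With 10 rows and 4 chunks the group ids are 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, so the chunks have 2, 3, 2 and 3 rows.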


