Fast Large Matrix Multiplication in R

There are many ways to approach this, depending on your code, your effort, and your hardware.

  1. Use the 'best' function for the job

The simplest is to use crossprod(a, b), which is the same as t(a) %*% b but avoids forming the transpose explicitly. (Note: this will only be a small increase in speed.)

crossprod(a, b)
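
As a quick sanity check (a sketch, not from the original answer; timings vary with your BLAS and hardware):

library(microbenchmark)
a <- matrix(rnorm(1e6), 1000, 1000)
b <- matrix(rnorm(1e6), 1000, 1000)
all.equal(crossprod(a, b), t(a) %*% b)            # TRUE
microbenchmark(t(a) %*% b, crossprod(a, b), times = 10)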

  2. Use Rcpp (and likely RcppEigen/RcppArmadillo)

C++ will likely give a greater increase in speed, and using dedicated linear algebra libraries (hence Eigen and Armadillo) will likely help further. This, however, assumes you are willing to write some C++; see the benchmark in the EDIT below.


  3. Use a better backend

After this point you are looking at BLAS backends such as OpenBLAS, ATLAS, etc. Hooking these up to R varies depending upon your OS; it is quite easy on a Debian system like Ubuntu, where the backend can be swapped through the package manager. These backends can sometimes be leveraged further by libraries such as Armadillo and Eigen.
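
To confirm which backend R is actually using, sessionInfo() reports the BLAS/LAPACK library paths (R >= 3.4). The update-alternatives call below is a sketch for Debian/Ubuntu; the exact alternative name varies by release:

sessionInfo()   # look for the "Matrix products", "BLAS" and "LAPACK" lines

# Outside R, on Debian/Ubuntu (assumes e.g. libopenblas-dev is installed):
#   sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu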


  4. GPU Computing

If you have a GPU (AMD, NVIDIA, etc.), you can leverage its many cores to greatly speed up your computations. A few packages could be useful here, including gpuR, gputools, and gmatrix.
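
A minimal sketch with gpuR, assuming a working OpenCL driver and a card that supports double precision (otherwise use type = "float"); the conversion back to a base matrix may vary by gpuR version:

library(gpuR)

A <- matrix(rnorm(1e6), 1000, 1000)
B <- matrix(rnorm(1e6), 1000, 1000)

gpuA <- gpuMatrix(A, type = "double")   # host-side matrix with GPU methods
gpuB <- gpuMatrix(B, type = "double")

gpuC <- gpuA %*% gpuB                   # the product is computed on the GPU
C <- as.matrix(gpuC)                    # bring the result back as a base matrix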

EDIT - to address @jenesaiquoi's comment on the benefit of Rcpp

test.cpp

// [[Rcpp::depends(RcppArmadillo, RcppEigen)]]

#include <RcppArmadillo.h>
#include <RcppEigen.h>

// Multiply two matrices via Armadillo (copies A and B on the way in)
// [[Rcpp::export]]
SEXP armaMatMult(arma::mat A, arma::mat B){
    arma::mat C = A * B;

    return Rcpp::wrap(C);
}

// Multiply two matrices via Eigen (also copies A and B)
// [[Rcpp::export]]
SEXP eigenMatMult(Eigen::MatrixXd A, Eigen::MatrixXd B){
    Eigen::MatrixXd C = A * B;

    return Rcpp::wrap(C);
}

// Same as above, but Eigen::Map reuses R's memory instead of copying
// [[Rcpp::export]]
SEXP eigenMapMatMult(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
    Eigen::MatrixXd C = A * B;

    return Rcpp::wrap(C);
}

test.R

library(Rcpp)

A <- matrix(rnorm(10000), 100, 100)
B <- matrix(rnorm(10000), 100, 100)

library(microbenchmark)
sourceCpp("test.cpp")
microbenchmark(A%*%B, armaMatMult(A, B), eigenMatMult(A, B), eigenMapMatMult(A, B))

Unit: microseconds
                  expr     min       lq     mean   median       uq      max neval
               A %*% B 885.846 892.1035 933.7457 901.1010 938.9255 1411.647   100
     armaMatMult(A, B) 846.688 857.6320 915.0717 866.2265 893.7790 1421.557   100
    eigenMatMult(A, B) 205.978 208.1295 233.1882 217.0310 229.4730  369.369   100
 eigenMapMatMult(A, B) 192.366 194.9835 207.1035 197.5405 205.2550  366.945   100

Matrix multiplication speeds in R as fast as in Python?

I use MRO and Python with Anaconda, both linked against the MKL BLAS. Here are my results for the same data generating process, i.e. np.random.rand ('float64') or rnorm, with identical dimensions (average and standard deviation over 10 replications):

Python:

np.dot(A, B) # 1.3616 s (sd = 0.1776)

R:

Bt = t(B)                  # precompute the transpose once
a = A %*% B                # 2.0285 s (sd = 0.1897)
acp = tcrossprod(A, Bt)    # 1.3098 s (sd = 0.1206); tcrossprod(A, Bt) is A %*% t(Bt), i.e. A %*% B
identical(acp, a)          # TRUE

Why is this naive matrix multiplication faster than base R's?

A quick glance through names.c points you to do_matprod, the C function that is called by %*% and which is found in the file array.c. (Interestingly, it turns out that both crossprod and tcrossprod dispatch to that same function as well.)

Scrolling through the function, you can see that it takes care of a number of things your naive implementation does not, including:

  1. Keeps row and column names, where that makes sense.
  2. Allows for dispatch to alternative S4 methods when the two objects being operated on by a call to %*% are of classes for which such methods have been provided.
  3. Handles both real and complex matrices.
  4. Implements a series of rules for how to handle multiplication of a matrix and a matrix, a vector and a matrix, a matrix and a vector, and a vector and a vector. (Recall that in R, a vector on the LHS of %*% is treated as a row vector, whereas on the RHS it is treated as a column vector; this is the code that makes that so. See the short example after this list.)
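
For instance, the vector rules in item 4 are easy to see at the prompt:

v <- 1:3
m <- diag(3)
v %*% m    # v is treated as a 1 x 3 row vector; result is 1 x 3
m %*% v    # v is treated as a 3 x 1 column vector; result is 3 x 1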

Near the end of the function, it dispatches to either matprod or cmatprod. Interestingly (to me at least), in the case of real matrices, if either matrix might contain NaN or Inf values, then matprod dispatches to a function called simple_matprod, which is about as simple and straightforward as your own. Otherwise, it dispatches to one of a couple of BLAS Fortran routines which are presumably faster, if uniformly 'well-behaved' matrix elements can be guaranteed.
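
Since R 3.4.0 you can toggle this behaviour yourself via the matprod option, which makes the cost of the simple path easy to measure (a sketch; timings depend on your BLAS):

library(microbenchmark)
A <- matrix(rnorm(250000), 500, 500)
B <- matrix(rnorm(250000), 500, 500)

options(matprod = "blas")       # always call BLAS, even with NaN/Inf present
microbenchmark(A %*% B, times = 10)

options(matprod = "internal")   # naive triple loop, fully NaN/Inf-safe
microbenchmark(A %*% B, times = 10)

options(matprod = "default")    # restore the check-then-dispatch behaviour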

Fastest way to multiply a matrix by a vector

sweep seems to run a bit faster on my machine:

sweep(mat, 2, v, FUN = "*")

Some benchmarks:

> microbenchmark(mat %*% diag(v), sweep(mat, 2, v, FUN = "*"))

Unit: milliseconds
                        expr       min        lq   median        uq      max neval
             mat %*% diag(v) 214.66700 226.95551 231.2366 255.78493 349.1911   100
 sweep(mat, 2, v, FUN = "*")  42.42987  44.72254  62.9990  70.87403 127.2869   100
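
For what it's worth, plain recycling tends to beat both of the above, since it avoids building diag(v) and sweep's overhead entirely (a sketch, not part of the original answer):

mat <- matrix(rnorm(1e6), 1000, 1000)
v <- rnorm(1000)
scaled <- mat * rep(v, each = nrow(mat))         # scales column j by v[j]
all.equal(scaled, sweep(mat, 2, v, FUN = "*"))   # TRUE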

Speed up sparse matrix multiplication in R

You can create a sparse matrix using the Matrix package. Matrix/vector multiplication may be faster in this case. For example:

library(Matrix)
library(tictoc)
set.seed(123)
v <- sample(1e4)
m <- Matrix(sample(c(0, 1), length(v)^2, TRUE, c(.99, .01)),
            length(v), length(v), sparse = FALSE)  # dense 1e4 x 1e4, ~1% ones
sm <- Matrix(m, sparse = TRUE)                     # same data, sparse storage
tic("dense")
x <- m %*% v
toc()
#> dense: 0.094 sec elapsed
tic("sparse")
y <- sm %*% v
toc()
#> sparse: 0.006 sec elapsed
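
If the data are this sparse, it is usually better never to materialize the dense matrix at all; one way is to build it directly with Matrix::sparseMatrix (the construction below is an illustrative assumption, not part of the original answer):

library(Matrix)
set.seed(123)
n <- 1e4
nnz <- round(0.01 * n^2)                    # ~1% nonzero entries
sm2 <- sparseMatrix(i = sample(n, nnz, replace = TRUE),
                    j = sample(n, nnz, replace = TRUE),
                    x = 1, dims = c(n, n))  # duplicate (i, j) pairs are summed
y2 <- sm2 %*% sample(n)                     # stays in sparse routines throughout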

Fastest way to Hadamard-multiply all matrix columns with another matrix

You can try apply(A, 2, '*', B); to get the same result as colmat_prod, wrap it as array(apply(A, 2, '*', B), c(dim(B), ncol(A))):

identical(array(apply(A, 2, '*', B), c(dim(B), ncol(A))), colmat_prod(A, B))
#[1] TRUE

Another option is to use rep for the columns of A:

array(A[,rep(seq_len(ncol(A)), each=ncol(B))] * as.vector(B), c(dim(B), ncol(A)))
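
colmat_prod itself is not shown in this excerpt; a plausible reconstruction, consistent with the identical() check above and labeled hypothetical, is:

# Hypothetical reference implementation: scale B element-wise by each
# column of A in turn, stacking the results into a dim(B) x ncol(A) array.
colmat_prod <- function(A, B) {
  out <- array(0, c(dim(B), ncol(A)))
  for (j in seq_len(ncol(A))) {
    out[, , j] <- B * A[, j]
  }
  out
}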

Timings:

library(microbenchmark)
microbenchmark(colmat_prod(A1, B1),
               colmat_prod_vec(A1, B1),
               array(apply(A1, 2, '*', B1), c(dim(B1), ncol(A1))),
               array(A1[, rep(seq_len(ncol(A1)), each = ncol(B1))] * as.vector(B1),
                     c(dim(B1), ncol(A1))),
               times = 10)
#Unit: milliseconds
#                                                                expr      min        lq      mean    median        uq       max neval cld
#                                                 colmat_prod(A1, B1) 831.5437  857.0305  910.5694  878.6842  999.5354 1025.0915    10   c
#                                             colmat_prod_vec(A1, B1) 981.9241 1010.9482 1174.1700 1162.7004 1319.3478 1444.6158    10   d
#                  array(apply(A1, 2, "*", B1), c(dim(B1), ncol(A1))) 716.1469  725.7862  765.4987  732.2520  789.3843  907.4417    10   b
# array(A1[, rep(seq_len(ncol(A1)), each = ncol(B1))] * as.vector(B1), c(dim(B1), ncol(A1))) 404.8460 406.2848 430.4043 428.2685 458.9400 462.0634 10 a

