Why Is This Naive Matrix Multiplication Faster Than Base R's?

Why is this naive matrix multiplication faster than base R's?

A quick glance in names.c (here in particular) points you to do_matprod, the C function that is called by %*% and which is found in the file array.c. (Interestingly, it turns out that both crossprod and tcrossprod dispatch to that same function as well.) Here is a link to the code of do_matprod.

Scrolling through the function, you can see that it takes care of a number of things your naive implementation does not, including:

  1. Keeps row and column names, where that makes sense.
  2. Allows for dispatch to alternative S4 methods when the two objects being operated on by a call to %*% are of classes for which such methods have been provided. (That's what's happening in this portion of the function.)
  3. Handles both real and complex matrices.
  4. Implements a series of rules for how to handle multiplication of a matrix and a matrix, a vector and a matrix, a matrix and a vector, and a vector and a vector. (Recall that in matrix multiplication in R, a vector on the LHS is treated as a row vector, whereas on the RHS it is treated as a column vector; this is the code that makes that so.)

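The vector-handling rules in point 4 can be seen directly at the R prompt; a quick illustration using nothing but base R:

```r
A <- matrix(1:6, nrow = 2)   # a 2 x 3 matrix

c(1, 1) %*% A        # LHS vector treated as a 1 x 2 row vector -> 1 x 3 result
A %*% c(1, 1, 1)     # RHS vector treated as a 3 x 1 column vector -> 2 x 1 result
c(1, 2) %*% c(3, 4)  # vector %*% vector gives the inner product as a 1 x 1 matrix
```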
Near the end of the function, it dispatches to either matprod or cmatprod. Interestingly (to me at least), in the case of real matrices, if either matrix might contain NaN or Inf values, matprod dispatches (here) to a function called simple_matprod, which is about as simple and straightforward as your own. Otherwise, it dispatches to one of a couple of BLAS Fortran routines which, presumably, are faster if uniformly 'well-behaved' matrix elements can be guaranteed.
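For intuition, simple_matprod amounts to the classic triple loop. Here is a hypothetical R-level sketch of it (the real thing is C code in array.c; only the idea, not the code, comes from the source):

```r
# Naive triple-loop matrix product, sketching what a simple_matprod-style
# fallback does: z[i, j] = sum over k of x[i, k] * y[k, j]
naive_matprod <- function(x, y) {
  stopifnot(ncol(x) == nrow(y))
  z <- matrix(0, nrow(x), ncol(y))
  for (i in seq_len(nrow(x)))
    for (j in seq_len(ncol(y)))
      for (k in seq_len(ncol(x)))
        z[i, j] <- z[i, j] + x[i, k] * y[k, j]
  z
}
```

Unlike some optimized BLAS kernels, a plain loop like this propagates NaN and Inf exactly as IEEE arithmetic dictates, which is why R keeps such a fallback around.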

Why is Rcpp slower than R when using matrix multiplication?

This is the opposite but expected result from what was seen for matrix-vector multiplication. Here R is using BLAS to do all the heavy work, which might even work in parallel. You are throwing away all the optimized memory management done in the BLAS library by using your naive matrix multiplication.

Instead of trying to reinvent the low-level stuff like matrix multiplication, you could try to implement larger parts of your code using something like RcppArmadillo, which uses the same BLAS library as R but also (not only!) offers a convenient syntax on top of that.
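A minimal sketch of that approach, assuming the RcppArmadillo package is installed (armaProd is a name made up for this example):

```r
library(Rcpp)
# Armadillo's operator* forwards the product to the BLAS that R links against
Rcpp::cppFunction(depends = "RcppArmadillo", '
arma::mat armaProd(const arma::mat& A, const arma::mat& B) {
    return A * B;
}')

A <- matrix(rnorm(100), 10, 10)
all.equal(armaProd(A, A), A %*% A)   # same BLAS, same numbers as base R
```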

Matrix-multiplication with RcppArmadillo: Why is it not faster?

R itself farms this out to LAPACK/BLAS -- and so does code linked to R that calls into LAPACK/BLAS. So yes, both approaches run the same code, and differences are only due to a small overhead.

There are many tutorials out there that tell you how to change your LAPACK libraries. This depends of course on the operating system. Start maybe with the R Installation and Administration manual and its appendices.
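Before switching anything, it is worth checking what your R is currently linked against; on recent R versions this can be done from the prompt:

```r
# Where do the linear-algebra routines actually come from?
sessionInfo()   # prints the BLAS/LAPACK paths near the top of its output
La_library()    # full path of the LAPACK shared library (R >= 3.4)
La_version()    # LAPACK version string
```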

Why is this Rcpp code not faster than pure R?

When something can't be true, you should check closely what is going on :)

Here your R function called your C++ function because you were not careful about where fib() was coming from. A repaired, more explicit version follows.

Code

fibR <- function(n) {
    if (n < 2)
        n
    else
        fibR(n - 1) + fibR(n - 2)
}

library(Rcpp)
Rcpp::cppFunction('int fibRcpp(int x) {
    if (x < 2)
        return(x);
    else
        return(fibRcpp(x - 1) + fibRcpp(x - 2));
}')

library(rbenchmark)
benchmark(fibRcpp(30), fibR(30))[,1:3]

Output

         test replications elapsed
2    fibR(30)          100  80.846
1 fibRcpp(30)          100   0.153

This is a rather well-studied example; see the directory examples/Misc/ in your Rcpp installation for versions of this benchmark that go back years. Also look at prior questions here on SO.

Why is matrix multiplication faster with Repa than with hmatrix?

Perhaps you are using a non-optimized LAPACK library. On my computer, using libatlas-base, the running time is about 0.4 s.

$ cat matrixproduct.hs

import Numeric.LinearAlgebra

main = do
    let a = (1000><1000) $ replicate (1000*1000) (1::Double)
        b = konst 1 (1000,1000)
    print $ a @@> (100,100)
    print $ b @@> (100,100)
    print $ (a <> b) @@> (900,900)

$ ghc matrixproduct.hs -O

$ time ./matrixproduct

1.0
1.0
1000.0

real 0m0.331s
user 0m0.512s
sys 0m0.016s

Correct way to perform matrix multiplication

You have to transform the data frames into matrices:

D <- as.matrix(data.frame(X = c(1,1,-1,1), Y = c(1,-1,1,1), Z = c(1,1,1,-1)))
E <- as.matrix(data.frame(X = c(-1,0,1), Y = c(-1,1,0), Z = c(1,1,1)))
P <- D %*% E
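The conversion matters because %*% only accepts numeric (or complex) matrices and vectors; handed a data frame, it stops with an error:

```r
df <- data.frame(X = c(1, 2), Y = c(3, 4))
# df %*% df   # Error: requires numeric/complex matrix/vector arguments

m <- as.matrix(df)   # drops the data-frame structure, keeps the numbers
m %*% t(m)           # now a valid 2 x 2 matrix product
```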

Speed up sparse matrix multiplication in R

You can create a sparse matrix using the Matrix package. Matrix/vector multiplication may be faster in this case. For example:

library(Matrix)
library(tictoc)
set.seed(123)
v <- sample(1e4)
m <- Matrix(sample(c(0, 1), length(v)^2, TRUE, c(.99, .01)),
            length(v), length(v), sparse = FALSE)
sm <- Matrix(m, sparse = TRUE)
tic("dense")
x <- m %*% v
toc()
#> dense: 0.094 sec elapsed
tic("sparse")
y <- sm %*% v
toc()
#> sparse: 0.006 sec elapsed
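If the data are sparse to begin with, you can also skip the dense intermediate entirely and build the matrix from (row, column, value) triplets with sparseMatrix(); the sizes below are made up for illustration:

```r
library(Matrix)
set.seed(123)
n   <- 1e3
nnz <- 5e3   # number of nonzero entries

# Construct directly in triplet form; duplicated (i, j) pairs are summed
sm <- sparseMatrix(i = sample(n, nnz, replace = TRUE),
                   j = sample(n, nnz, replace = TRUE),
                   x = 1,
                   dims = c(n, n))

v <- sample(n)
y <- sm %*% v   # matrix-vector product, work proportional to nnz
```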

