Rcpp pass by reference vs. by value
The key is the 'proxy model': your xa really is the same memory location as your original object, so you end up changing your original.
If you don't want that, you should do one of two things: make a (deep) copy using the clone() method, or explicitly create a new object into which the altered values get written. Method two does not do that: you simply use two differently named variables which are both "pointers" (in the proxy model sense) to the original variable.
An additional complication, though, is the implicit cast and copy that happens when you pass an int vector (from R) to a NumericVector type: that creates a copy, and then the original no longer gets altered.
Here is a more explicit example, similar to one I use in the tutorials or workshops:
library(inline)
f1 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
Rcpp::NumericVector xa(a);
int n = xa.size();
for(int i=0; i < n; i++) {
if(xa[i]<0) xa[i] = 0;
}
return xa;
')
f2 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
Rcpp::NumericVector xa(a);
int n = xa.size();
Rcpp::NumericVector xr(a); // still points to a
for(int i=0; i < n; i++) {
if(xr[i]<0) xr[i] = 0;
}
return xr;
')
p <- seq(-2,2)
print(class(p))
print(cbind(f1(p), p))
print(cbind(f2(p), p))
p <- as.numeric(seq(-2,2))
print(class(p))
print(cbind(f1(p), p))
print(cbind(f2(p), p))
and this is what I see:
edd@max:~/svn/rcpp/pkg$ r /tmp/ari.r
Loading required package: methods
[1] "integer"
p
[1,] 0 -2
[2,] 0 -1
[3,] 0 0
[4,] 1 1
[5,] 2 2
p
[1,] 0 -2
[2,] 0 -1
[3,] 0 0
[4,] 1 1
[5,] 2 2
[1] "numeric"
p
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 1 1
[5,] 2 2
p
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 1 1
[5,] 2 2
edd@max:~/svn/rcpp/pkg$
So it really matters whether you pass int-to-float (converted, hence copied, so p is untouched) or float-to-float (shared, so p is altered in place).
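The behaviour above can be imitated in plain C++ without Rcpp. The following is only a sketch of the idea, not Rcpp's actual implementation: `NumHandle` and `clamp_negatives` are hypothetical names, and a `std::shared_ptr` stands in for the underlying R object. Copying the handle shares storage, `clone()` makes a deep copy, and building the handle from integers forces a conversion into fresh storage.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an Rcpp vector: a thin handle around shared storage.
class NumHandle {
public:
    // Wrap existing double storage without copying it (the "proxy" case).
    explicit NumHandle(std::shared_ptr<std::vector<double>> data)
        : data_(std::move(data)) {}

    // Integer storage cannot be shared as doubles, so the values are
    // converted into fresh storage (the implicit cast-and-copy case).
    explicit NumHandle(const std::vector<int>& ints)
        : data_(std::make_shared<std::vector<double>>(ints.begin(), ints.end())) {}

    // Deep copy, playing the role of clone().
    NumHandle clone() const {
        return NumHandle(std::make_shared<std::vector<double>>(*data_));
    }

    double& operator[](std::size_t i) { return (*data_)[i]; }
    std::size_t size() const { return data_->size(); }

private:
    std::shared_ptr<std::vector<double>> data_;
};

// Same logic as f1 and f2: set negative entries to zero through the handle.
// The handle is passed by value, yet it still writes to the shared storage.
void clamp_negatives(NumHandle x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (x[i] < 0) x[i] = 0;
    }
}
```

In this model, calling `clamp_negatives` on a handle built from double storage changes the caller's data, while calling it on `handle.clone()` or on a handle built from a `std::vector<int>` leaves the caller's data alone, which mirrors the two halves of the output above.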
Within C++ functions, how are Rcpp objects passed to other functions (by reference or by copy)?
There is no deep copy in Rcpp unless you ask for it with clone. When you pass by value, you are making a new List object, but it uses the same underlying R object.
So the difference between pass by value and pass by reference is small.
However, when you pass by value, you have to pay the price of protecting the underlying object one more time. That can incur extra cost, because Rcpp relies for this on R_PreserveObject, which is recursive and not very efficient.
My guideline would be to pass by reference whenever possible so that you don't pay the extra protection price. If you know that ABCfun2 won't change the object, I'd advise passing by reference to const: ABCfun2( const List& ). If you are going to make changes to the List, then I'd recommend using ABCfun2( List& ).
Consider this code:
#include <Rcpp.h>
using namespace Rcpp ;
#define DBG(MSG,X) Rprintf("%20s SEXP=<%p>. List=%p\n", MSG, (SEXP)X, &X ) ;
void fun_copy( List x, const char* idx ){
x[idx] = "foo" ;
DBG( "in fun_copy: ", x) ;
}
void fun_ref( List& x, const char* idx ){
x[idx] = "bar" ;
DBG( "in fun_ref: ", x) ;
}
// [[Rcpp::export]]
void test_copy(){
// create a list of 2 components
List data = List::create( _["a"] = 1, _["b"] = 2 ) ;
DBG( "initial: ", data) ;
fun_copy( data, "a") ;
DBG( "\nafter fun_copy (1): ", data) ;
// add a new component "d" to the list, passed by value
fun_copy( data, "d") ;
DBG( "\nafter fun_copy (2): ", data) ;
}
// [[Rcpp::export]]
void test_ref(){
// create a list of 2 components
List data = List::create( _["a"] = 1, _["b"] = 2 ) ;
DBG( "initial: ", data) ;
fun_ref( data, "a") ;
DBG( "\nafter fun_ref (1): ", data) ;
// add a new component "d" to the list, passed by reference
fun_ref( data, "d") ;
DBG( "\nafter fun_ref (2): ", data) ;
}
All I'm doing is passing a list to a function, updating it, and printing some information about the pointer to the underlying R object and the pointer to the List object (this).
Here are the results of what happens when I call test_copy and test_ref:
> test_copy()
initial: SEXP=<0x7ff97c26c278>. List=0x7fff5b909fd0
in fun_copy: SEXP=<0x7ff97c26c278>. List=0x7fff5b909f30
after fun_copy (1): SEXP=<0x7ff97c26c278>. List=0x7fff5b909fd0
$a
[1] "foo"
$b
[1] 2
in fun_copy: SEXP=<0x7ff97b2b3ed8>. List=0x7fff5b909f20
after fun_copy (2): SEXP=<0x7ff97c26c278>. List=0x7fff5b909fd0
$a
[1] "foo"
$b
[1] 2
We start with an existing list associated with an R object.
initial: SEXP=<0x7fda4926d278>. List=0x7fff5bb5efd0
We pass it by value to fun_copy, so we get a new List, but one using the same underlying R object:
in fun_copy: SEXP=<0x7fda4926d278>. List=0x7fff5bb5ef30
We exit fun_copy. Same underlying R object again, and back to our original List:
after fun_copy (1): SEXP=<0x7fda4926d278>. List=0x7fff5bb5efd0
Now we call fun_copy again, but this time updating a component that was not on the list: x["d"]="foo".
in fun_copy: SEXP=<0x7fda48989120>. List=0x7fff5bb5ef20
List had no choice but to create a new underlying R object for itself, but this object underlies only the local List. Therefore when we get out of fun_copy, we are back to our original List with its original underlying SEXP.
after fun_copy (2): SEXP=<0x7fda4926d278>. List=0x7fff5bb5efd0
The key thing here is that the first time, "a" was already on the list, so we updated the data directly. Because the object local to fun_copy and the outer object from test_copy share the same underlying R object, modifications inside fun_copy were propagated.
The second time, fun_copy grows its local List object, associating it with a brand new SEXP, which does not propagate to the outer function.
Now consider what happens when you pass by reference :
> test_ref()
initial: SEXP=<0x7ff97c0e0f80>. List=0x7fff5b909fd0
in fun_ref: SEXP=<0x7ff97c0e0f80>. List=0x7fff5b909fd0
after fun_ref(1): SEXP=<0x7ff97c0e0f80>. List=0x7fff5b909fd0
$a
[1] "bar"
$b
[1] 2
in fun_ref: SEXP=<0x7ff97b5254c8>. List=0x7fff5b909fd0
after fun_ref(2): SEXP=<0x7ff97b5254c8>. List=0x7fff5b909fd0
$a
[1] "bar"
$b
[1] 2
$d
[1] "bar"
There is only one List object, 0x7fff5b909fd0. When we have to get a new SEXP in the second call, it correctly gets propagated to the outer level.
To me, the behavior you get when passing by reference is much easier to reason about.
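The difference between the two calls can be reproduced in plain C++ with no Rcpp involved. This is only an analogy under stated assumptions: `ListHandle`, `set_by_value`, `set_by_ref` and `peek` are hypothetical names, a `std::shared_ptr` to a map plays the role of the underlying SEXP, and "growing" is modelled as allocating new storage and rebinding the handle, as described above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a List: a handle to shared storage.
struct ListHandle {
    using Storage = std::map<std::string, std::string>;
    std::shared_ptr<Storage> sexp = std::make_shared<Storage>();

    // Existing name: write through to the shared storage.
    // New name: allocate new storage and rebind THIS handle only.
    void set(const std::string& name, const std::string& value) {
        if (sexp->count(name) == 0) {
            sexp = std::make_shared<Storage>(*sexp);
        }
        (*sexp)[name] = value;
    }

    bool has(const std::string& name) const { return sexp->count(name) != 0; }
    std::string get(const std::string& name) const { return sexp->at(name); }
};

// By value: the local handle shares storage until it has to grow.
void set_by_value(ListHandle x, const std::string& name) { x.set(name, "foo"); }

// By reference: a rebinding of the handle is visible to the caller.
void set_by_ref(ListHandle& x, const std::string& name) { x.set(name, "bar"); }

// Read-only access: const reference, no handle copy and no mutation.
bool peek(const ListHandle& x, const std::string& name) { return x.has(name); }
```

With a handle that already holds "a" and "b", `set_by_value(data, "a")` changes the caller's "a", `set_by_value(data, "d")` leaves the caller without a "d", and `set_by_ref(data, "d")` adds "d" for the caller as well.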
Rcpp and R: pass by reference
No, R does not make a copy immediately, only if it is necessary, i.e., copy-on-modify:
x <- 1
tracemem(x)
#[1] "<0000000009A57D78>"
y <- x
tracemem(x)
#[1] "<0000000009A57D78>"
x <- 2
tracemem(x)
#[1] "<00000000099E9900>"
Since you modify M by reference outside R, R can't know that a copy is necessary. If you want to ensure a copy is made, you can use data.table::copy. Or avoid the side effect in your C++ code, e.g., make a deep copy there (by using clone).
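To make the contrast concrete, here is a small plain C++ model of the two kinds of write. It is a sketch under assumptions, not how R is implemented: `Binding`, `r_style_assign` and `in_place_assign` are hypothetical names, and the `std::shared_ptr` use count stands in for whatever bookkeeping R uses to decide that a value is shared.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Buffer = std::vector<double>;
using Binding = std::shared_ptr<Buffer>;  // hypothetical model of an R variable

// Model of an R-level modification: copy first if the buffer is shared.
void r_style_assign(Binding& var, std::size_t i, double value) {
    if (var.use_count() > 1) {
        var = std::make_shared<Buffer>(*var);  // copy-on-modify
    }
    (*var)[i] = value;
}

// Model of in-place C++ code: writes into the buffer with no sharing check,
// so every binding that points at the buffer sees the change.
void in_place_assign(const Binding& var, std::size_t i, double value) {
    (*var)[i] = value;
}
```

If `y` is bound to the same buffer as `x`, `r_style_assign(x, 0, 2.0)` gives `x` its own copy and leaves `y` untouched, whereas `in_place_assign(x, 0, 42.0)` changes what both names see. The second case is the side effect that a deep copy avoids.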
Rcpp Update matrix passed by reference and return the update in R
Let's start by reiterating that this is probably bad practice. Don't use void; return your changed object instead, which is the more common approach.
That said, you can make it work in either way. For RcppArmadillo, pass by (explicit) reference. I get the desired behaviour
> sourceCpp("/tmp/so.cpp")
> M1 <- M2 <- matrix(0, 2, 2)
> bar(M1)
> M1
[,1] [,2]
[1,] 42 0
[2,] 0 0
> foo(M2)
> M2
[,1] [,2]
[1,] 42 0
[2,] 0 0
>
out of this short example:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
void bar(Rcpp::NumericMatrix M) {
M(0,0) = 42;
}
// [[Rcpp::export]]
void foo(arma::mat& M) {
M(0,0) = 42;
}
/*** R
M1 <- M2 <- matrix(0, 2, 2)
bar(M1)
M1
foo(M2)
M2
*/
function pass by reference in RcppArmadillo
A double is not a native R type (so there is always a copy being made) and no pass-through reference is possible.
Instead, use Rcpp::NumericVector, which is a proxy for a SEXP type. This works:
R> sourceCpp("/tmp/so44047145.cpp")
R> x = 1.0
R> myfun(x)
Inside myfun: x = 0.0361444
R> x
[1] 0.0361444
R>
Below is the full code with another small repair or two:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
//[[Rcpp::export]]
void myfun(Rcpp::NumericVector &x){
arma::mat X = arma::randu<arma::mat>(5,5);
arma::mat Y = X.t()*X;
arma::mat R1 = chol(Y);
x[0] = arma::det(R1);
Rcpp::Rcout << "Inside myfun: x = " << x << std::endl;
}
/*** R
x = 1.0   # initialize x
myfun(x)  # update x to a new value calculated internally
x         # return the new x; it should be different from 1
*/