Calling R Function from C++

How to call C function from R?

I searched Stack Overflow first, but found no answer to this there.

The general idea is as follows (the commands are for Linux, but the approach is the same on other operating systems):

  1. Create a function that takes only pointers to basic types and does everything through side effects (it returns void), e.g.:

    void addOneToVector(int* n, double* vector) {
        for (int i = 0; i < *n; ++i)
            vector[i] += 1.0;
    }
  2. Compile the C source file as a dynamic library. R provides a shortcut for this:

    $ R CMD SHLIB lib.c
  3. Load the dynamic library from R (R CMD SHLIB lib.c produces lib.so):

    dyn.load("lib.so")
  4. Call the C function using R's .C function, e.g.:

    x = 1:3
    ret_val = .C("addOneToVector", n=length(x), vector=as.double(x))

.C returns a list from which you can read the values of the arguments after the call, e.g.:

ret_val$vector # 2, 3, 4

You can now wrap this call in an R function to make it easier to use.
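If the source file is C++ rather than C, the function also needs C linkage so that .C can look it up by its unmangled name. The following is a minimal sketch under that assumption. scaleVector is a hypothetical function, not part of the original answer. It follows the same convention as addOneToVector: pointers only, void return, results written in place.

```cpp
// Hypothetical example (not from the original answer): same calling
// convention as addOneToVector, written so that it also compiles as C++.
extern "C" void scaleVector(int* n, double* factor, double* vector) {
    // Every argument arrives as a pointer, including scalars such as the
    // length and the factor. The result is written back in place.
    for (int i = 0; i < *n; ++i) {
        vector[i] *= *factor;
    }
}
```

From R, this would be called in the same way as above, for instance .C("scaleVector", n = length(x), factor = as.double(2), vector = as.double(x))$vector.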

There is a nice page describing the whole process in more detail here (it also covers Fortran):

http://users.stat.umn.edu/~geyer/rc/

Calling the agrep .Internal C function from Rcpp

Great question. The long and short of it is "You can't" (in many cases), at least not that easily, unless the function is visible in one of the header files in "src/include/".

Not long ago I had a similar fun challenge, where I tried to get access to the do_docall function (called by do.call), and it is not a simple task. First of all, it is not directly possible to just #include <agrep.c> (or something similar). That file simply isn't available for inclusion, as it is not a part of the "src/include". It is compiled and the uncompiled file is removed (not to mention that one should never "include" a .c file).

If one is willing to go the extra mile, then the next step to look at is "copying" and "altering" the source code. Basically, find the function in "src/main/agrep.c", copy it into your package and then fix any errors you find.

Problems with this approach:

  1. As documented in R-exts, the internal structures of sexprec_info are not made public (this is the base structure for all objects in R). Many internal functions use the fields within this structure, so one has to "copy" the structure into one's own source code to make it visible to that code specifically.
  2. If you ever #include <Rcpp.h> prior to this file, you will need to go through each and every call to internal functions and likely add either an R_ or an Rf_ prefix.
  3. The function may contain calls to other "internal" functions, which in turn need to be copied and altered for it to work.
  4. You will also need a clear understanding of what CDR, CAR and similar accessors do. The internal functions have a documented structure, where the first argument contains the full call passed to the function, and functions like those two are used to access parts of the call.
    I did myself a solid and rewrote do_docall, changing the input format, to avoid having to consider this. But this takes time. The alternative is to create a pairlist according to the documentation, set its type as a call-sexp (the exact name is lost to me at the moment) and pass the appropriate arguments for op, args and env.
  5. And lastly, if you go through the steps and find that it is necessary to copy the internal structures of sexprec_info (as described in item 1), then you will need to be very careful about when you include Rinternals and Rcpp, as either of these causes your code to crash and burn in the most beautiful and silent way if you include your header and these in the wrong order! Note that this even goes for [[Rcpp::export]], which may indeed turn out to include them in the wrong order!
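To make item 4 a little more concrete, here is a toy model of a pairlist in plain C++. It is not R's actual object layout. Node, car, cdr and sumArgs are hypothetical names, used only to show the pattern of walking an argument list one cell at a time: take the head, then move on to the rest.

```cpp
// Toy stand-in for a pairlist cell: a value plus a pointer to the rest
// of the list. This is an illustration, not R's internal structure.
struct Node {
    double value;
    const Node* next;
};

// car: the value held in the first cell (loosely analogous to CAR).
double car(const Node* cell) { return cell->value; }

// cdr: the remainder of the list after the first cell (loosely analogous to CDR).
const Node* cdr(const Node* cell) { return cell->next; }

// Walk the list: read the head with car, advance with cdr, and stop
// when the list is exhausted.
double sumArgs(const Node* args) {
    double total = 0.0;
    for (const Node* cell = args; cell != nullptr; cell = cdr(cell)) {
        total += car(cell);
    }
    return total;
}
```

In the real sources the cells hold arbitrary R objects rather than doubles, which is where most of the extra care comes in.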

If you are willing to go this far down the rabbit hole, I would suggest carefully reading the "R's C interface" chapter of Advanced R, Chapters 2, 5 and 6 of R-exts, and maybe even the R Internals manual. Once that is done, take a look at do_docall in src/main/coerce.c and compare it to the implementation in my repository cmdline.arguments/src/utils/{cmd_coerce.h, cmd_coerce.c}. In this version I have

  1. Added all the internal structures that are not public, so that I can access their unmodified form (unmodified by the current session).
    • This includes the table used to store the currently used SEXPs, which was used as a lookup. This caused a problem as I can't access the modified version, so my code is slightly altered, with the old code blocked by the macro #if --- defined(CMDLINE_ARGUMENTS_MAYBE_IN_THE_FUTURE). Luckily the code causing the problem had a static answer, so I could work around this (but this might not always be the case).
  2. Added quite a few Rf_ prefixes, as the macro versions are not available (since I #include <Rcpp.h> at some point).
  3. Split the code into smaller functions to make it more readable (for my own sake).
  4. Added one additional argument (name), which is not used in the internal function, along with some added errors (for my specific need).

This implementation will be frozen "for all time to come" as I've moved on to another branch (and this one is frozen for my own future benefit, if I ever want to walk down this path again).

I spent a few days scouring the internet for information on this and found two different posts talking about how this could be achieved, and my approach basically copies them. Whether this is actually allowed in a CRAN package is a whole other question (and not one that I will be testing out).

The same approach applies if you want to use non-public code from other packages, although there it is often as simple as copy-pasting their files into your repository.

As a final side note, you mention that the intent is to "speed up" your code for when you have to perform millions upon millions of calls to agrep. This seems like a case where one should consider performing the task in parallel. Even after going through the steps outlined above, creating N parallel sessions that each take care of K evaluations (say 100,000) would be the first step to reduce computing time. Of course, each session should be given a batch and not a single call to agrep.
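The batching idea can be sketched as follows. This only illustrates how a total number of evaluations might be divided into contiguous batches, one per session. batchRange and its parameters are hypothetical names, the sketch assumes workers is greater than zero, and the parallel dispatch itself is left out.

```cpp
#include <cstddef>
#include <utility>

// Return the half-open index range [begin, end) that worker number
// `worker` (0-based) should process when `total` tasks are split over
// `workers` workers. The first `total % workers` workers each receive
// one extra task, so batch sizes differ by at most one.
// Assumption: workers > 0 and worker < workers.
std::pair<std::size_t, std::size_t> batchRange(std::size_t total,
                                               std::size_t workers,
                                               std::size_t worker) {
    const std::size_t base = total / workers;
    const std::size_t extra = total % workers;
    const std::size_t begin = worker * base + (worker < extra ? worker : extra);
    const std::size_t size = base + (worker < extra ? 1 : 0);
    return {begin, begin + size};
}
```

Each session would then loop over its own range and make all of its agrep calls locally, so the cost of starting a session is paid once per batch and not once per call.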

Call R function from C wrapper

This is very easy with Rcpp:

Rcpp::cppFunction("SEXP callFun(Function f) {
  return f(1);
}")

callFun(function(x) x + 10)
# [1] 11

Calling R from C from R in mcmc

We do precisely this in the mcmc package on CRAN, http://cran.us.r-project.org/web/packages/mcmc/index.html . That link includes a way to download the source code.

This package implements the Metropolis-Hastings algorithm. Specifically, C code runs the MH loop but calls a user-supplied R function to evaluate the log unnormalized target density at each iteration. It also calls a user-supplied output function.

I've run this code with very large models and datasets, so it's definitely feasible to run "large" MCMC estimations in R with this approach.
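The structure described above, a compiled loop that calls back into user-supplied functions at every iteration, can be sketched in plain C++. This is not the mcmc package's code. It is a one-dimensional random-walk Metropolis sampler written for illustration, in which the two std::function arguments stand in for the user-supplied R functions, and all names (metropolis, logDensity, output, scale, seed) are assumptions.

```cpp
#include <cmath>
#include <functional>
#include <random>
#include <vector>

// Random-walk Metropolis sampler for a one-dimensional target.
// logDensity stands in for the user-supplied function returning the log
// unnormalized target density; output stands in for the user-supplied
// output function applied to the state at each iteration.
// Assumption: scale > 0 and iterations >= 0.
std::vector<double> metropolis(const std::function<double(double)>& logDensity,
                               const std::function<double(double)>& output,
                               double initial, double scale,
                               int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::normal_distribution<double> step(0.0, scale);

    std::vector<double> draws;
    draws.reserve(iterations);
    double current = initial;
    double currentLogDensity = logDensity(current);

    for (int i = 0; i < iterations; ++i) {
        const double proposal = current + step(rng);
        const double proposalLogDensity = logDensity(proposal);
        // Accept with probability min(1, exp(proposal - current)),
        // evaluated on the log scale.
        if (std::log(unif(rng)) < proposalLogDensity - currentLogDensity) {
            current = proposal;
            currentLogDensity = proposalLogDensity;
        }
        draws.push_back(output(current));
    }
    return draws;
}
```

The loop itself is cheap, so the cost per iteration is dominated by the callback that evaluates the log density.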

Calling R function from Rcpp

If you are willing to modify intecxx by hardcoding the call to inte inside the body, rather than trying to pass it as a parameter, you could use this approach:

#include <Rcpp.h>

/*** R
inte = function(x, y, a, b){
  model = approxfun(x, y)
  return(integrate(model, a, b)$value)
}

.x <- 1:10
set.seed(123)
.y <- rnorm(10)
*/

// [[Rcpp::export]]
double intecxx(Rcpp::NumericVector x, Rcpp::NumericVector y, double a, double b) {
    Rcpp::NumericVector res;
    Rcpp::Environment G = Rcpp::Environment::global_env();
    Rcpp::Function inte = G["inte"];
    res = inte(x, y, a, b);
    return res[0];
}

I defined inte in the same source file as intecxx to ensure that it is available in the global environment, and therefore callable from within intecxx through G.

R> inte(.x, .y, 1, 10)
[1] 1.249325

R> intecxx(.x, .y, 1, 10)
[1] 1.249325

R> all.equal(inte(.x, .y, 1, 10),intecxx(.x, .y, 1, 10))
[1] TRUE

Calling C code from an R package, within C

The R_RegisterCCallable / R_GetCCallable solution pointed to by @BrodieG is probably better than the one below, at least when one can modify the package where registration is required and where the choice of function to call is straightforward. (The example below came from more-or-less complicated R code that chooses one of several functions to pass to C, much like lapply's FUN argument, where the choice of function is much easier to implement in R than in C.) Also relevant is "Linking to other packages" when wanting to expose or access many functions.
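For readers unfamiliar with the register-then-look-up idea, here is a toy model in plain C++. It is not R's implementation and does not use R's headers. registerCallable, getCallable, GenericFun, addOne and callAddOne are hypothetical names chosen to mirror the roles of R_RegisterCCallable and R_GetCCallable: one package stores a function pointer under a (package, name) key, and another retrieves it and casts it back to the correct type before calling it.

```cpp
#include <map>
#include <string>

// Generic function-pointer type used to store any registered routine.
typedef void (*GenericFun)();

// Toy registry keyed by "package::name" (illustration only).
static std::map<std::string, GenericFun>& registry() {
    static std::map<std::string, GenericFun> table;
    return table;
}

// Providing side: make a routine available under a package and a name.
void registerCallable(const std::string& package, const std::string& name,
                      GenericFun fun) {
    registry()[package + "::" + name] = fun;
}

// Consuming side: look the routine up; returns nullptr if it is unknown.
GenericFun getCallable(const std::string& package, const std::string& name) {
    auto it = registry().find(package + "::" + name);
    return it == registry().end() ? nullptr : it->second;
}

// A routine that the hypothetical "provider" package wants to share.
int addOne(int x) { return x + 1; }

// The consumer must know the real signature and cast back to it before
// calling. This assumes addOne has already been registered.
typedef int (*AddOneFun)(int);
int callAddOne(int x) {
    AddOneFun fun = reinterpret_cast<AddOneFun>(getCallable("provider", "addOne"));
    return fun(x);
}
```

The weak point is the cast on the consuming side: nothing checks that the signature the consumer assumes matches the one the provider registered.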

A related possibility is to register your C functions in the rje package, using something like, in R_init_rje.c

#include <Rinternals.h>
#include <R_ext/Rdynload.h>

SEXP rje(SEXP who) {
    Rprintf("Hello %s\n", CHAR(STRING_ELT(who, 0)));
    return R_NilValue;
}

static const R_CallMethodDef callMethods[] = {
    {".rje", (DL_FUNC) &rje, 1},
    {NULL, NULL, 0}
};

void R_init_rje(DllInfo * info)
{
    R_registerRoutines(info, NULL, callMethods, NULL, NULL);
}

and in the NAMESPACE

useDynLib(rje, .registration=TRUE)

The address of the C-level entry point is then available in R as

rje_c = getNativeSymbolInfo(".rje", PACKAGE="rje")

and can be used in your other package by using this as an argument to a C function, e.g.,

.Call(.use_rje, rje_c$address, "A User")

with

#include <Rinternals.h>
#include <R_ext/Rdynload.h>

/* convenience definition of the function template */
typedef SEXP RJE_C_FUN(SEXP who);

SEXP use_rje(SEXP rje_c_fun, SEXP who) {
    /* retrieve the function pointer, using an appropriate cast */
    RJE_C_FUN *fun = (RJE_C_FUN *) R_ExternalPtrAddr(rje_c_fun);
    return fun(who);
}

It's too clumsy to illustrate this in a package, but the principle is illustrated by the following file rje.c

#include <Rinternals.h>
#include <R_ext/Rdynload.h>

/* convenience definition of the function template */
typedef SEXP RJE_C_FUN(SEXP who);

SEXP rje(SEXP who) {
    Rprintf("Hello '%s'\n", CHAR(STRING_ELT(who, 0)));
    return R_NilValue;
}

SEXP use_rje(SEXP rje_c_fun, SEXP who) {
    /* retrieve the function pointer, using an appropriate cast */
    RJE_C_FUN *fun = (RJE_C_FUN *) R_ExternalPtrAddr(rje_c_fun);
    return fun(who);
}

static const R_CallMethodDef callMethods[] = {
    {".rje", (DL_FUNC) &rje, 1},
    {".use_rje", (DL_FUNC) &use_rje, 2},
    {NULL, NULL, 0}
};

void R_init_rje(DllInfo * info)
{
    R_registerRoutines(info, NULL, callMethods, NULL, NULL);
}

Compile with R CMD SHLIB rje.c, and use as

> dyn.load("rje.so")
> .Call(".use_rje", getNativeSymbolInfo("rje")$address, "A User")
Hello 'A User'
NULL

