Calling Custom Functions from Python Using Rpy2

Accessing an R user-defined function in Python

Consider importing the arbitrary R user-defined function as a package with rpy2's SignatureTranslatedAnonymousPackage (STAP):

from rpy2.robjects.numpy2ri import numpy2ri
from rpy2.robjects.packages import STAP
# for rpy2 < 2.6.1
# from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage as STAP    

r_fct_string = """    
R_pls <- function(X_train, y_train, X_test){
  library(pls)

  X <- as.matrix(X_train)
  y <- as.matrix(y_train)
  xt <- as.matrix(X_test)

  tdata <- data.frame(y,X=I(X))
  REGmodel <- pls::pcr(y~X,scale=FALSE,data=tdata,validation="CV")
  B <- RMSEP(REGmodel)
  C <- B[[1]]
  q <- length(C)
  degs <- c(1:q)
  allvals <- C[degs%%2==0]
  allvals <- allvals[-1]
  comps <- which.min(allvals)
  ndata <- data.frame(X=I(xt))

  ypred_test <- as.data.frame(predict(REGmodel,ncomp=comps,newdata=ndata,se.fit=TRUE))
  ntdata <- data.frame(X=I(X))
  ypred_train <- as.data.frame(predict(REGmodel,ncomp=comps,newdata=ntdata,se.fit=TRUE))
  data_out <- list(ypred_test=ypred_test, ypred_train=ypred_train)

  return(data_out)
}
"""

r_pkg = STAP(r_fct_string, "r_pkg")

# CONVERT PYTHON NUMPY MATRICES TO R OBJECTS
r_X_train, r_y_train, r_X_test = map(numpy2ri, (py_X_train, py_y_train, py_X_test))

# PASS R OBJECTS INTO FUNCTION (WILL NEED TO EXTRACT DFs FROM RESULT)
p_res = r_pkg.R_pls(r_X_train, r_y_train, r_X_test)
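To see the STAP mechanism in isolation, here is a minimal sketch with a much smaller R function. The names R_SOURCE, add_one, demo_pkg and load_r_package are made up for illustration and do not come from the answer above. rpy2 is imported inside the function only so that the module can still be imported on a machine without R.

```python
# Hypothetical R source: a one-line function instead of the PLS routine.
R_SOURCE = """
add_one <- function(x) {
  x + 1
}
"""


def load_r_package(source=R_SOURCE, name="demo_pkg"):
    """Wrap a string of R code as an anonymous package.

    Every function defined in `source` becomes an attribute of the
    returned object, so it can be called like a Python function.
    """
    # Imported lazily: requires rpy2 and a working R installation.
    from rpy2.robjects.packages import STAP

    return STAP(source, name)


# Usage, assuming rpy2 and R are available:
#   pkg = load_r_package()
#   result = pkg.add_one(41)   # an R vector; result[0] is its first element
```

The same pattern should scale to R_pls: only the R source string and the attribute name change.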

Alternatively, if the function is saved in a separate .R script, you can source it as @agstudy shows here, then call it like any Python function.

import rpy2.robjects as ro
from rpy2.robjects.numpy2ri import numpy2ri

ro.r('''source('my_R_pls_func.r')''')

r_pls = ro.globalenv['R_pls']

# CONVERT PYTHON NUMPY MATRICES TO R OBJECTS
r_X_train, r_y_train, r_X_test = map(numpy2ri, (py_X_train, py_y_train, py_X_test))

# PASS R OBJECTS INTO FUNCTION (WILL NEED TO EXTRACT DFs FROM RESULT)
p_res = r_pls(r_X_train, r_y_train, r_X_test)
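A note on the conversion step: Python's built-in map, when given several iterables, passes one element from each to the function in a single call, so a one-argument converter has to receive the arrays as a single collection. The sketch below illustrates this with a hypothetical stand-in converter (fake_converter) rather than numpy2ri itself, so it runs without R.

```python
def fake_converter(array):
    """Hypothetical stand-in for a one-argument converter such as numpy2ri."""
    return ("r_object", tuple(array))


def convert_all(converter, *arrays):
    """Apply a single-argument converter to each array in turn."""
    return tuple(converter(a) for a in arrays)


X_train, y_train, X_test = [1, 2], [3, 4], [5, 6]

# One collection of arrays: the converter is called once per array.
r_X_train, r_y_train, r_X_test = map(fake_converter, (X_train, y_train, X_test))

# Several iterables: map calls fake_converter(1, 3, 5), which fails
# because the converter accepts only one argument.
try:
    list(map(fake_converter, X_train, y_train, X_test))
    multi_iterable_failed = False
except TypeError:
    multi_iterable_failed = True
```

convert_all is just a more explicit spelling of the first map call.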

Calling R script from python using rpy2

source is an R function that runs an R source file. In rpy2 there are therefore two ways to call it, either:

import rpy2.robjects as robjects
r = robjects.r
r['source']('script.R')

or:

import rpy2.robjects as robjects
r = robjects.r
r.source('script.R')

r[r.source("script.R")] is the wrong way to do it: it calls source first and then uses the returned R object as a lookup key.

The same idea may apply to the next line of the code in the question.
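As a rough end-to-end sketch of the sourcing approach: write a small R script, source it, then fetch the function it defines from R's global environment. The script contents and the names double_it, write_r_script and source_and_call are hypothetical. Passing the path as an argument to source (rather than pasting it into a string of R code) avoids quoting problems with unusual paths.

```python
import os
import tempfile


def write_r_script(directory):
    """Write a tiny, hypothetical R script and return its path."""
    path = os.path.join(directory, "script.R")
    with open(path, "w") as handle:
        handle.write("double_it <- function(x) x * 2\n")
    return path


def source_and_call(path, func_name, *args):
    """Source an R script, then call one of the functions it defines."""
    # Imported lazily: requires rpy2 and a working R installation.
    import rpy2.robjects as robjects

    robjects.r["source"](path)                # look up R's source(), then call it
    r_function = robjects.globalenv[func_name]
    return r_function(*args)


# Usage, assuming rpy2 and R are available:
#   with tempfile.TemporaryDirectory() as tmp:
#       result = source_and_call(write_r_script(tmp), "double_it", 21)
```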

Accessing functions with a dot in their name (e.g. as.vector) using rpy2

Get a reference to the function using the rpy2.robjects.r interface.

So, you could do something like:

import rpy2.robjects as robjects

as_vector = robjects.r("as.vector")
vect = as_vector(r_vect)
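Another route, for functions that live in an R package, is importr, which exposes R names containing dots under Python-friendly names with underscores. The sketch below is a simplified illustration: python_name only mimics the basic dot-to-underscore renaming (rpy2's own logic also deals with name clashes), and both helper names are hypothetical.

```python
def python_name(r_name):
    """Simplified view of the renaming: dots in R names become underscores."""
    return r_name.replace(".", "_")


def get_base_function(r_name):
    """Fetch a function from R's base package by its R name."""
    # Imported lazily: requires rpy2 and a working R installation.
    from rpy2.robjects.packages import importr

    base = importr("base")
    return getattr(base, python_name(r_name))


# Usage, assuming rpy2 and R are available:
#   as_vector = get_base_function("as.vector")   # same as base.as_vector
```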

How to pass an R function as argument using rpy2 in a Python code

The original questioner had their question answered on the NMF project on GitHub. As described there, you define your new algorithm as a function, use setNMFMethod to add it to the registry of algorithms that perform nonnegative matrix factorization, and then call it by name.

Passing a model as an R function's parameter when calling the R function using rpy2 from multiple threads

If I understand it correctly, you:

1. have a function (classifier) written in R that requires a relatively large body of data to work (k nearest neighbors?)
2. are loading that body of data using Python
3. would like to load the parameters once and after that make as many calls to the classifier as required
4. plan to pass the body of data as a parameter for the classifier

If you follow 4., copying is not always necessary, but currently it is avoided only if the data is numerical or boolean and the memory region is allocated by R.

However, I think a simpler alternative for that situation is to pass the body of data to R once and for all (copying it if necessary) and then use that converted object.

from rpy2.robjects.packages import importr
e1071 = importr('e1071')

from rpy2.robjects.conversion import py2ri

# your model's data are in 'm_data'
# here conversion is happening
r_m_data = py2ri(m_data)

for test_data in many_test_data:
    # r_m_data is already a pointer to an R data structure
    # (it was converted above - no further copying is made)
    res = e1071.knn(r_m_data, test_data)

This corresponds to what you describe as:

"How can I load the model file just once, so that the in-memory model object can be re-used when the same R function is called through rpy2?"
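The convert-once idea can be sketched independently of R. In the sketch below, ReusableModel and its convert and classify arguments are hypothetical stand-ins (for example, an rpy2 conversion function and an R classifier). The lock is my own assumption for the multithreaded case: the R interpreter is single-threaded, so serializing calls into it seems the cautious choice.

```python
import threading


class ReusableModel:
    """Convert the model data once, then reuse it for every prediction."""

    def __init__(self, convert, classify, model_data):
        self._classify = classify
        self._converted = convert(model_data)   # the only conversion
        self._lock = threading.Lock()           # assumption: serialize calls

    def predict(self, test_data):
        # Only one thread at a time calls the classifier.
        with self._lock:
            return self._classify(self._converted, test_data)


# Stand-ins so the sketch runs without R.
conversions = []


def convert(data):
    conversions.append(data)
    return tuple(data)


def classify(model, test_value):
    return sum(model) + test_value


model = ReusableModel(convert, classify, [1, 2, 3])
results = []
threads = [
    threading.Thread(target=lambda t=t: results.append(model.predict(t)))
    for t in range(5)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

However many threads call predict, convert runs only once, in the constructor.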