How Do Rpy2, Pyrserve and Pyper Compare

How do rpy2, pyRserve and PypeR compare?

I know one of the 3 better than the others, but in the order given in the question:

rpy2:

  • C-level interface between Python and R (R running as an embedded process)
  • R objects exposed to Python without the need to copy the data over
  • Conversely, Python's numpy arrays can be exposed to R without making a copy
  • Low-level interface (close to the R C-API) and high-level interface (for convenience)
  • In-place modification for vectors and arrays possible
  • R callback functions can be implemented in Python
  • Possible to have anonymous R objects with a Python label
  • Python pickling possible
  • Full customization of R's behavior with its console (so possible to implement a full R GUI)
  • Limited support for Microsoft Windows

pyRserve:

  • native Python code (will/should/may work with CPython, Jython, IronPython)
  • uses R's Rserve
  • advantages and inconveniences linked to remote computation and to Rserve

PypeR:

  • native Python code (will/should/may work with CPython, Jython, IronPython)
  • use of pipes to have Python communicate with R (with the advantages and inconveniences linked to it)
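The pipe mechanism itself can be sketched with the standard library alone. This is only an illustration of the idea, not PypeR's implementation: so that it runs without R installed, a child Python interpreter stands in for the R process, and the names CHILD_CODE and evaluate_over_pipe are made up for this sketch.

```python
import subprocess
import sys

# Stand-in for the R process (an assumption made so the sketch runs without R):
# a child interpreter that reads one expression per line from its stdin and
# prints each result to its stdout.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(eval(line))\n"
)


def evaluate_over_pipe(expressions):
    """Send expressions to the child through a pipe and return its text replies."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input="\n".join(expressions) + "\n",  # commands travel as plain text
        capture_output=True,                  # replies come back as plain text
        text=True,
        check=True,
    )
    return completed.stdout.splitlines()


replies = evaluate_over_pipe(["1 + 2", "sum([0.5, 0.25])"])
print(replies)  # ['3', '0.75']
```

Note that the replies are strings: everything that crosses a pipe is serialized to text and has to be parsed back on the other side, which is one of the inconveniences of this approach. The advantage is that nothing beyond a separate process is needed.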

edit: Windows support for rpy2

What is the best interface from Python 3.1.1 to R?

edit: Rewrite to summarize the edits that accumulated over time.

The current rpy2 release (2.3.x series) has full support for Python 3.3; no claim is made about Python 3.0, 3.1, or 3.2.
At the time of writing, the next rpy2 release (2.4.x series, under development) supports only Python 3.3.

History of Python 3 support:

  • rpy2-2.1.0-dev / Python 3 branch in the repository: experimental support, plus an application for a Google Summer of Code project to port rpy2 to Python 3 (under the Python umbrella).

  • The application was accepted and, thanks to Google's funding, Python 3 support slowly made its way into the main codebase. A fair bit of work remained after the GSoC; it landed in the version_2.2.x branch.

Python interface for R Programming Language

As pointed out by @lgautier, there is already another answer on this subject. I leave my answer here as it adds the experience of approaching R as a novice, knowing Python first.


I use both Python and R and sympathise with your need as a newcomer to R.

Since any answer you get will be subjective, I summarise a few points from my experience:

  • I use rpy2 as my interface and find it is 'Pythonic', stable, predictable, and effective enough for my needs. I have not used the other packages so this is not a comment on them, rather on the merits of rpy2 itself.
  • BUT do not expect an easy way of using R in Python without learning both. I find that an interface between the two languages makes coding easy when you know both, but makes debugging a nightmare for anyone who is weak in one of them.

My advice:

  1. For most applications, Python has packages that cover most of what you would want to do in R, from data wrangling to plotting. Check out SciPy, NumPy, pandas, Biopython, matplotlib and other scientific packages, or even the full Anaconda or Enthought Python distributions. This lets you stay within the Python environment and gives you most of the power that you need.
  2. At the same time, you will want R's vast range of specialised packages, so spend some time learning it in an interactive environment. I found it almost impossible to master even basic R on the command line, but RStudio and the tutorials at Quick-R and Learn-R got me going very fast.

Once you know both, then you will do magic with rpy2 without the horrors of cross-language debugging.


New Resources

Update on 29 Jan 2015

This answer has proved popular and so I thought it would be useful to point out two more recent resources:

  • Ralph Heinkel gave a great talk on this subject at EuroPython 2014. The video of "Combining the powerful worlds of Python and R" is available on the EuroPython YouTube channel. Quoting him:

The triplet R, Rserve, and pyRserve allows the building up of a network bridge from Python to R: Now R-functions can be called from Python as if they were implemented in Python, and even complete R scripts can be executed through this connection.

  • It is now possible to combine R and Python using rmagic in IPython/Jupyter, which greatly eases the work of producing reproducible research and notebooks that combine both languages.

Using R package pmultinom with PyRserve

Here is one option using PypeR; we have used it in production settings without any issues.

from pyper import R  # import only the R session class instead of everything
r = R(use_pandas=True)
num1 = 1
num2 = 2
num3 = 3
num4 = 4
num5 = 5
num6 = 6
num7 = 20000
vec1 = (.17649, .17542, .15276, .15184, .17227, .17122)

We don't need to create individual objects; a single list or tuple, as in vec1, would do. They are assigned one by one here just to demonstrate:

r.assign("rnum1", num1)
r.assign("rnum2", num2)
r.assign("rnum3", num3)
r.assign("rnum4", num4)
r.assign("rnum5", num5)
r.assign("rnum6", num6)
r.assign("rnum7", num7)
r.assign("rvec1", vec1)

Create an expression:

expr = "library(pmultinom); out <- pmultinom(lower = c(rnum1, rnum2, rnum3, rnum4, rnum5, rnum6), upper = rep.int(3630, 6), size = rnum7, probs = rvec1, method = 'exact')"

Then evaluate the expression and retrieve the output:

r(expr)
r.get("out")
#0.95663799758361
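Since a list or tuple can be assigned in one go, the repeated assign calls can be collapsed into a loop over a mapping. The following is a minimal sketch of that idea, not part of the original answer: RecordingSession is a hypothetical stand-in that only records assignments, so the example runs without R or PypeR, and the names assign_all, rlower and rsize are invented for illustration.

```python
class RecordingSession:
    """Hypothetical stand-in for a PypeR session; it only records assignments."""

    def __init__(self):
        self.assigned = {}

    def assign(self, name, value):
        # Same call shape as r.assign("rnum1", num1) in the walkthrough.
        self.assigned[name] = value


def assign_all(session, values):
    """Call session.assign(name, value) for every entry in the mapping."""
    for name, value in values.items():
        session.assign(name, value)


values = {
    "rlower": (1, 2, 3, 4, 5, 6),  # replaces rnum1 ... rnum6
    "rsize": 20000,                # replaces rnum7
    "rvec1": (.17649, .17542, .15276, .15184, .17227, .17122),
}

session = RecordingSession()
assign_all(session, values)
print(sorted(session.assigned))  # ['rlower', 'rsize', 'rvec1']
```

With a real session created as r = R(use_pandas=True), the call would presumably be assign_all(r, values), and the R expression could then use lower = rlower and size = rsize instead of spelling out c(rnum1, ..., rnum6).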

Testing from the R side directly:

num1 = 1
num2 = 2
num3 = 3
num4 = 4
num5 = 5
num6 = 6
num7 = 20000
vec1 = c(.17649, .17542, .15276, .15184, .17227, .17122)

pmultinom(lower = c(num1, num2, num3, num4, num5, num6),
          upper = rep.int(3630, 6), size = num7, probs = vec1,
          method = 'exact')
#[1] 0.956638

PypeR fails if R uses library(tm)

Solution: load the dependent libraries first (R does this automatically; PypeR does not). For example:

library(NLP)
library(tm)

library(RColorBrewer)
library(wordcloud)
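From the Python side, the same load order can be turned into a single R statement. This is a small sketch with a made-up helper name (library_calls); it only builds the string and assumes it is then handed to the PypeR session in the usual way.

```python
def library_calls(packages):
    """Build one R statement that loads the packages in the order given."""
    return "; ".join(f"library({name})" for name in packages)


# Dependencies are listed before the packages that need them
# (NLP before tm, RColorBrewer before wordcloud).
load_expr = library_calls(["NLP", "tm", "RColorBrewer", "wordcloud"])
print(load_expr)
# library(NLP); library(tm); library(RColorBrewer); library(wordcloud)
```

With a session r created by R(), the statement would be evaluated with r(load_expr) before running any code that relies on tm or wordcloud.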

