How do Rpy2, pyrserve and PypeR compare?
I know one of the 3 better than the others, but in the order given in the question:
rpy2:
- C-level interface between Python and R (R running as an embedded process)
- R objects exposed to Python without the need to copy the data over
- Conversely, Python's numpy arrays can be exposed to R without making a copy
- Low-level interface (close to the R C-API) and high-level interface (for convenience)
- In-place modification for vectors and arrays possible
- R callback functions can be implemented in Python
- Possible to have anonymous R objects with a Python label
- Python pickling possible
- Full customization of R's behavior with its console (so possible to implement a full R GUI)
- Limited support for MS Windows
pyRserve:
- native Python code (will/should/may work with CPython, Jython, IronPython)
- uses R's Rserve
- advantages and inconveniences linked to remote computation and to Rserve
PypeR:
- native Python code (will/should/may work with CPython, Jython, IronPython)
- use of pipes to have Python communicate with R (with the advantages and inconveniences linked to it)
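To make the pipe-based approach more concrete, here is a minimal sketch of the general technique: start a child interpreter, write a line to its stdin, and read the reply from its stdout. This is not PypeR's actual implementation. The helper names `start_child` and `ask` are hypothetical, and a small Python echo process stands in for R so that the sketch runs without R installed.

```python
import subprocess
import sys


def start_child(command):
    """Start a child process with text-mode pipes on stdin and stdout."""
    return subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on the parent side
    )


def ask(child, line):
    """Send one line to the child and return the one line it answers with."""
    child.stdin.write(line + "\n")
    child.stdin.flush()
    return child.stdout.readline().rstrip("\n")


# Stand-in for an R process (assumption): echoes each input line upper-cased.
ECHO_CHILD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(line.strip().upper(), flush=True)\n",
]

child = start_child(ECHO_CHILD)
reply = ask(child, "hello from python")

# Closing stdin signals end-of-input, so the child exits cleanly.
child.stdin.close()
child.wait()
child.stdout.close()
```

The trade-offs mentioned above follow from this design: everything crosses the pipe as text, so data is copied and parsed on both sides, but no compiled extension is needed and the child process is isolated from the Python interpreter.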
edit: Windows support for rpy2
What is the best interface from Python 3.1.1 to R?
edit: Rewrite to summarize the edits that accumulated over time.
The current rpy2 release (2.3.x series) has full support for Python 3.3, while
no claim is made about Python 3.0, 3.1, or 3.2.
At the time of writing, the next rpy2 release (2.4.x series, under development) supports only Python 3.3.
History of Python 3 support:
- rpy2-2.1.0-dev / Python 3 branch in the repository: experimental support, and an application for a Google Summer of Code project to port rpy2 to Python 3 (under the Python umbrella).
- The application was accepted and, thanks to Google's funding, Python 3 support slowly made its way into the main codebase (a fair bit of work remained after the GSoC; it landed in the version_2.2.x branch).
Python interface for R Programming Language
As pointed out by @lgautier, there is already another answer on this subject. I leave my answer here as it adds the experience of approaching R as a novice, knowing Python first.
I use both Python and R and sympathise with your need as a newcomer to R.
Since any answer you get will be subjective, I summarise a few points from my experience:
- I use rpy2 as my interface and find it is 'Pythonic', stable, predictable, and effective enough for my needs. I have not used the other packages so this is not a comment on them, rather on the merits of rpy2 itself.
- BUT do not expect an easy way of using R in Python without learning both. I find that an interface between the two languages makes coding easy when you know both, but debugging becomes a nightmare for someone who is weak in one of the languages.
My advice:
- For most applications, Python has packages that let you do most of the things you want to do in R, from data wrangling to plotting. Check out SciPy, NumPy, pandas, BioPython, matplotlib and other scientific packages, or even the full Anaconda or Enthought Python distributions. This lets you stay within the Python environment and gives you most of the power that you need.
- At the same time, you will want R's vast range of specialised packages, so spend some time learning it in an interactive environment. I found it almost impossible to master even basic R on the command line, but RStudio and the tutorials at Quick-R and Learn-R got me going very fast.
Once you know both, then you will do magic with rpy2 without the horrors of cross-language debugging.
New Resources
Update on 29 Jan 2015
This answer has proved popular and so I thought it would be useful to point out two more recent resources:
- Ralph Heinkel gave a great talk on this subject at EuroPython 2014. The video on Combining the powerful worlds of Python and R is available on the EuroPython YouTube channel. Quoting him:
The triplet R, Rserve, and pyRserve allows the building up of a network bridge from Python to R: Now R-functions can be called from Python as if they were implemented in Python, and even complete R scripts can be executed through this connection.
- It is now possible to combine R and Python using rmagic in IPython/Jupyter, greatly easing the work of producing reproducible research and notebooks that combine both languages.
Using R package pmultinom with PyRserve
Here is one option with pyper, as we have used it in production settings and it worked without any issues:
from pyper import R
r = R(use_pandas=True)
num1 = 1
num2 = 2
num3 = 3
num4 = 4
num5 = 5
num6 = 6
num7 = 20000
vec1 = (.17649, .17542, .15276, .15184, .17227, .17122)
We don't need to create individual objects; a list or tuple, as in vec1, works too. The separate variables are just for demonstration.
r.assign("rnum1", num1)
r.assign("rnum2", num2)
r.assign("rnum3", num3)
r.assign("rnum4", num4)
r.assign("rnum5", num5)
r.assign("rnum6", num6)
r.assign("rnum7", num7)
r.assign("rvec1", vec1)
Create an expression
expr = "library(pmultinom); out <- pmultinom(lower = c(rnum1, rnum2, rnum3, rnum4, rnum5, rnum6), upper = rep.int(3630, 6), size = rnum7, probs = rvec1, method = 'exact')"
Then evaluate the expression and get the output
r(expr)
r.get("out")
#0.95663799758361
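As a variation on the remark that individual objects are not needed, the expression can also be built from Python sequences directly, with no intermediate `r.assign` calls. The sketch below is pure string building and is not part of PypeR. The helper names `r_vector` and `pmultinom_expr` are hypothetical, and the approach assumes the values are plain numbers whose Python `repr` is also valid R syntax.

```python
def r_vector(values):
    """Render a Python sequence of numbers as an R c(...) literal."""
    return "c(" + ", ".join(repr(v) for v in values) + ")"


def pmultinom_expr(lower, upper, size, probs, method="exact"):
    """Build the R statement that loads pmultinom and stores the result in `out`."""
    return (
        "library(pmultinom); "
        "out <- pmultinom(lower = {lower}, upper = {upper}, "
        "size = {size}, probs = {probs}, method = '{method}')"
    ).format(
        lower=r_vector(lower),
        upper=r_vector(upper),
        size=size,
        probs=r_vector(probs),
        method=method,
    )


lower = (1, 2, 3, 4, 5, 6)
upper = [3630] * 6  # same values as rep.int(3630, 6) in the R call
probs = (.17649, .17542, .15276, .15184, .17227, .17122)

expr = pmultinom_expr(lower, upper, 20000, probs)
```

The resulting string can be passed to the session in the same way as before, with `r(expr)` followed by `r.get("out")`.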
Testing from the R side directly:
num1 = 1
num2 = 2
num3 = 3
num4 = 4
num5 = 5
num6 = 6
num7 = 20000
vec1 = c(.17649, .17542, .15276, .15184, .17227, .17122)
pmultinom(lower = c(num1, num2, num3, num4, num5, num6),
upper = rep.int(3630, 6), size = num7, probs = vec1,
method = 'exact')
#[1] 0.956638
PypeR fails if R uses library(tm)
Solution: the dependent libraries must be loaded first (R does this automatically, PypeR does not). For example:
library(NLP)
library(tm)
library(RColorBrewer)
library(wordcloud)
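From the Python side, one way to keep that load order explicit is to generate the `library(...)` calls from an ordered list and send them as a single statement. This is a small sketch; the helper `library_calls` and the name `DEPENDENCY_ORDER` are hypothetical and not part of PypeR.

```python
def library_calls(packages):
    """Join library() calls in the given order into one R statement string."""
    return "; ".join("library({})".format(name) for name in packages)


# Dependencies first, then the packages that need them.
DEPENDENCY_ORDER = ["NLP", "tm", "RColorBrewer", "wordcloud"]

load_expr = library_calls(DEPENDENCY_ORDER)
print(load_expr)
```

With a PypeR session `r = R()`, the string would then be evaluated with `r(load_expr)` before running any code that relies on tm or wordcloud.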