How to Set Up the Environment Variable R_USER to Use rpy2 in Python

How to correctly set up rpy2?

You can integrate R with Python through rpy2 using either a conda environment or a Docker image. The Docker approach is easier to set up, while the conda approach lets you manage separate environments, in this case one containing both R and Python.

1. Using rpy2 with Docker Image

After installing Docker Desktop on your system (see this link), you can use the datascience-notebook image from Jupyter. Just type in your terminal:

docker run -it -e GRANT_SUDO=yes --user root --rm -p 8888:8888 -p 4040:4040 -v D:/:/home/jovyan/work jupyter/datascience-notebook

If this is the first time you run this command, Docker will pull the image first. Notice that we're mounting the local directory D:/ as a volume in the Docker container. To allow this, enable file sharing in the Docker Desktop settings.

Then, in a Jupyter Notebook cell, just type import rpy2; rpy2 comes preinstalled with this image.

2. Using rpy2 with Anaconda Environment

After successfully installing the Anaconda distribution, open the Anaconda Prompt and create a new conda environment; in this case I'm calling it rpy2-env.

conda create -n rpy2-env r-essentials r-base python=3.7

Notice that I'm including R and Python 3.7 in this environment. At the time of writing, rpy2 is not yet compatible with the latest version of Python. Then activate your environment and install rpy2.

conda activate rpy2-env
conda install -c r rpy2

Now you can use rpy2 by typing python or ipython in the terminal, or through a Jupyter Notebook.

import rpy2.situation
for row in rpy2.situation.iter_info():
    print(row)

3. Installing R packages (Optional)

Additionally, if you need to install R packages, you could type in the terminal

R -e 'install.packages("package_name")'

or inside a Jupyter Notebook

import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector

# Choosing a CRAN Mirror
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)

# Installing required packages
# (the CRAN name is 'ggplot2'; 'stats' ships with base R and needs no install)
packages = ('ggplot2',)
utils.install_packages(StrVector(packages))
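
The terminal command from this section can also be launched from Python without going through a shell, which sidesteps the quoting around the R expression. The sketch below is only an illustration: the helper name r_install_command is hypothetical, and the default mirror URL (the CRAN cloud mirror) is an assumption that you may want to replace.

```python
import subprocess


def r_install_command(packages, repos="https://cloud.r-project.org"):
    """Build the argument list for: R -e 'install.packages(...)'.

    Hypothetical helper; the default `repos` mirror is an assumption.
    """
    names = ", ".join('"{}"'.format(p) for p in packages)
    expr = 'install.packages(c({}), repos="{}")'.format(names, repos)
    # Passing a list to subprocess means no shell is involved, so the
    # double quotes inside the R expression need no extra escaping.
    return ["R", "-e", expr]


cmd = r_install_command(["ggplot2"])
print(cmd)

# To actually run it (requires an R executable on the PATH):
# subprocess.run(cmd, check=True)
```

The call to subprocess.run is left commented out so that nothing is installed by accident.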

How to set environment variable R_USER? And how to get module winreg?

It seems that you have to tweak your environment variables manually. To do this, open the environment variables dialog (Control Panel > System Settings > Advanced System Settings > (Advanced Tab) Environment Variables) and follow the steps from the top answer here: How to setup environment variable R_user to use rpy2 in python. (Make sure you also add R to the Path.)

Concerning the RRuntimeError you mentioned in the comments, it's hard to tell without further info, but it seems like your code tries to open a file (or install R libraries). Check out these links:

  • Error in file(file, "rt") : cannot open the connection
  • assign variables from python to R using r.assign and then use read.table
  • https://bitbucket.org/rpy2/rpy2/issues/399/rruntimeerror-error-in-file-file-rt-cannot

How to set a custom R installation for using rpy2 in Jupyter?

There are two ways to solve this: a local one (for individual Jupyter notebooks) and a global one (for the kernel itself). Both come down to setting the R_HOME environment variable.

Local (source):
Before calling %load_ext rpy2.ipython in your Jupyter notebook, run:

import os
os.environ['R_HOME'] = '/home/your/anaconda3/envs/myenv/lib/R' #path to your R installation
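
A slightly more defensive variant of the same idea is sketched below. The helper name point_rpy2_at is hypothetical (it is not part of rpy2); it only checks that the directory exists before setting R_HOME, and it has to run before rpy2 is imported for the first time.

```python
import os


def point_rpy2_at(r_home, environ=os.environ):
    """Set R_HOME to r_home if that directory exists; return True on success.

    Hypothetical helper, not part of rpy2. Call it before the first
    `import rpy2.robjects` or `%load_ext rpy2.ipython`.
    """
    if not os.path.isdir(r_home):
        return False
    environ["R_HOME"] = r_home
    return True


# Example path copied from the text above; replace it with your own.
if not point_rpy2_at("/home/your/anaconda3/envs/myenv/lib/R"):
    print("R_HOME left unchanged: directory not found")
```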

Global:
Find your kernel directory via: jupyter kernelspec list and edit the file kernel.json. Update the JSON by adding:
"env": {"R_HOME":"/home/your/anaconda3/envs/my-env-name/lib/R"}, then restart your kernel (you might have to restart Jupyter as well).
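
If you prefer not to edit kernel.json by hand, the same change can be scripted. This is a minimal sketch under the assumption that kernel.json is plain JSON with an optional top-level "env" object; the function name add_r_home_to_kernel is made up for illustration.

```python
import json
from pathlib import Path


def add_r_home_to_kernel(kernel_json, r_home):
    """Add or update env.R_HOME in a Jupyter kernel.json file.

    Other keys in the file, and other entries under "env", are kept.
    """
    path = Path(kernel_json)
    spec = json.loads(path.read_text(encoding="utf-8"))
    spec.setdefault("env", {})["R_HOME"] = r_home
    path.write_text(json.dumps(spec, indent=1) + "\n", encoding="utf-8")
    return spec


# Usage (both paths are placeholders; take the first one from
# `jupyter kernelspec list`):
# add_r_home_to_kernel("/path/to/kernels/python3/kernel.json",
#                      "/home/your/anaconda3/envs/my-env-name/lib/R")
```

As with the manual edit, the kernel has to be restarted afterwards.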

Update (messed up LD_LIBRARY_PATH)

Recently, I tried running rpy2 in jupyter again after setting up a new environment using conda:

conda config --add channels conda-forge
conda config --set channel_priority strict
conda create -n myenv python=3.7
conda activate myenv
conda install r-essentials pandas rpy2

And this time I ran into the following issue when trying to either %load_ext rpy2.ipython (Jupyter) or simply import rpy2.robjects (any script):

>>> import rpy2.robjects                                            
Warning message:
package ‘methods’ was built under R version 3.6.3
Error: package or namespace load failed for ‘stats’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/stats/libs/stats.so':
/home/your/anaconda3/envs/myenv/lib/R/library/stats/libs/stats.so: undefined symbol: MARK_NOT_MUTABLE
During startup - Warning messages:
1: package ‘datasets’ was built under R version 3.6.3
2: package ‘utils’ was built under R version 3.6.3
3: package ‘grDevices’ was built under R version 3.6.3
4: package ‘graphics’ was built under R version 3.6.3
5: package ‘stats’ was built under R version 3.6.3
6: package ‘stats’ in options("defaultPackages") was not found
R[write to console]: Error: package or namespace load failed for ‘tools’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so':
/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so: undefined symbol: R_NewPreciousMSet

R[write to console]: Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so':
/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so: undefined symbol: R_NewPreciousMSet

R[write to console]: In addition:
R[write to console]: Warning message:

R[write to console]: package ‘tools’ was built under R version 3.6.3

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/robjects/__init__.py", line 20, in <module>
    import rpy2.robjects.functions
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/robjects/functions.py", line 12, in <module>
    from rpy2.robjects import help
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/robjects/help.py", line 43, in <module>
    tools_ns = _get_namespace(StrSexpVector(('tools',)))
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/rinterface_lib/conversion.py", line 44, in _
    cdata = function(*args, **kwargs)
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/rinterface.py", line 621, in __call__
    raise embedded.RRuntimeError(_rinterface._geterrmessage())
rpy2.rinterface_lib.embedded.RRuntimeError: Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so':
/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so: undefined symbol: R_NewPreciousMSet

The issue turned out to be a broken R "situation" (check via %run -m rpy2.situation in Jupyter or simply python -m rpy2.situation on the command line): R's additions to LD_LIBRARY_PATH were pointing to an old, globally installed R version.

I had to manually unset the LD_LIBRARY_PATH to solve this issue. This path can be set / unset analogously to R_HOME.
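
To spot this kind of leftover before unsetting anything, a small check like the one below can help. It is only a sketch: the function name stray_r_libraries is hypothetical, and it assumes a Linux setup where R's shared library is named libR.so. It lists the LD_LIBRARY_PATH directories that contain a libR.so but lie outside the R_HOME you intend to use.

```python
import os


def stray_r_libraries(r_home, ld_library_path):
    """List LD_LIBRARY_PATH directories holding a libR.so outside r_home."""
    r_home = os.path.realpath(r_home)
    strays = []
    for entry in ld_library_path.split(os.pathsep):
        if not entry:
            continue  # skip empty segments such as a trailing ':'
        real = os.path.realpath(entry)
        inside = real == r_home or real.startswith(r_home + os.sep)
        if not inside and os.path.isfile(os.path.join(real, "libR.so")):
            strays.append(entry)
    return strays


r_home = os.environ.get("R_HOME")
if r_home:
    print(stray_r_libraries(r_home, os.environ.get("LD_LIBRARY_PATH", "")))
```

A non-empty result only hints at a conflicting R installation; it does not prove that this is the cause of the error.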

PS: I found R_HOME and LD_LIBRARY_PATH set in my .bashrc, pointing to a custom (built from source) R installation, which obviously confused the Jupyter kernel. Not smart ;)

PPS: rpy2.situation still warns me that the environment variable R_HOME differs from the default R in the PATH:

Looking for R's HOME:
Environment variable R_HOME: /home/your/anaconda3/envs/myenv/lib/R
Calling `R RHOME`: /home/your/anaconda3/envs/jupyter-env/lib/R
Environment variable R_LIBS_USER: None
Warning: The environment variable R_HOME differs from the default R in the PATH.

This makes me worry that R actually defaults to the R installed alongside Jupyter. If anybody has comments about this, I would be grateful.

rpy2: load R version installed in conda environment, not the one in the system

If I remember correctly, the link to the R installation to use is made during the installation of rpy2.

To use the specific R installation you mentioned, I guess you can follow these steps:

  • uninstall rpy2
  • put the bin folder of the targeted R installation at the front of the PATH environment variable, so that it takes precedence over the system R:
    export PATH=/path/to/conda/R-3.6.1/bin:${PATH}
  • set the R_HOME environment variable to the folder of the targeted R installation:
    export R_HOME=/path/to/conda/R-3.6.1/
  • install rpy2 again.

Rpy2 error whack-a-mole: R_USER not defined

You need to set the R_USER environment variable, e.g. to the username of the Windows account you use. See also this quote from this link:

1) Add the path to R.dll to my PATH variable (I went to the 32-bit directory)
2) Add an environment variable R_HOME (C:\Program Files\R\R-2.12.1 for me)
3) Add an environment variable R_USER (simply my username in Windows).
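
The three steps from the quote can also be applied to the current Python process before rpy2 is imported. The sketch below makes several assumptions: the helper name configure_r_env is made up, the directory holding R.dll is passed in explicitly because it depends on your R version, and whether in-process settings are enough on every Windows setup is not confirmed by the quoted answer.

```python
import getpass
import os


def configure_r_env(r_home, r_dll_dir, environ=os.environ, user=None):
    """Apply the PATH, R_HOME and R_USER settings to `environ`.

    Hypothetical helper; call it before importing rpy2.
    """
    # 1) make the directory containing R.dll visible on the PATH
    environ["PATH"] = r_dll_dir + os.pathsep + environ.get("PATH", "")
    # 2) point R_HOME at the R installation
    environ["R_HOME"] = r_home
    # 3) R_USER: by default the name of the current account
    environ["R_USER"] = user or getpass.getuser()
    return environ


# Usage (placeholder paths; adjust to your installation):
# configure_r_env(r"C:\Program Files\R\R-2.12.1", r"C:\path\to\dir\with\R.dll")
# import rpy2.robjects
```

Setting the variables through the Control Panel, as described above, makes them permanent; this variant only affects the running process.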


