How to Use Rpy2 to Save a Pandas Dataframe to an .Rdata File

Can I use rpy2 to save a pandas dataframe to an .Rdata file?

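You can use rpy2 to do this. Once you have the data in a pandas DataFrame, you have to transfer it to R. The pandas documentation described an experimental interface between pandas and R data.frames, pandas.rpy.common. That module was deprecated and later removed from pandas in favor of rpy2's own pandas2ri converter, so the example below only runs on old pandas versions. A code example copied from the website: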

from pandas import DataFrame
import pandas.rpy.common as com

df = DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
               index=["one", "two", "three"])
r_dataframe = com.convert_to_r_dataframe(df)

print(type(r_dataframe))
# <class 'rpy2.robjects.vectors.DataFrame'>

print(r_dataframe)
#       A B C
# one   1 4 7
# two   2 5 8
# three 3 6 9
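
Since pandas.rpy is no longer available, the same transfer can be sketched with rpy2's own pandas2ri converter, followed by the actual save to an .RData file. This is a minimal sketch assuming rpy2 3.x with a working R installation; the function name save_dataframe_as_rdata and its parameters are illustrative and not part of the original answer.

```python
def save_dataframe_as_rdata(df, path, name="df"):
    """Save a pandas DataFrame to an .RData file under the R variable `name`.

    Hypothetical helper for illustration; assumes rpy2 3.x and R are installed.
    """
    # Imported inside the function so this module still loads on machines
    # where rpy2 (and therefore R) is not installed.
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Inside this context, assigning a pandas DataFrame into an R
    # environment converts it to an R data.frame.
    with localconverter(ro.default_converter + pandas2ri.converter):
        ro.globalenv[name] = df

    # R's save(list = ..., file = ..., envir = ...) writes the named objects
    # found in the given environment to the file.
    ro.r["save"](list=ro.StrVector([name]), file=path, envir=ro.globalenv)
```

Calling save_dataframe_as_rdata(df, "my_data.RData") and then load("my_data.RData") in R should give a data.frame named df. Passing the name through list= avoids building an R code string by hand.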

Python dataframe write to R data format

The transfer of a data frame from Python to R can be done through the feather file format, which both languages can read and write.

Quick example.

Export in Python:

import feather
path = 'my_data.feather'
feather.write_dataframe(df, path)

Import in R:

library(feather)
path <- "my_data.feather"
df <- read_feather(path)

In this case you'll have the data in R as a data.frame. You can then decide to write it to an RData file.

save(df, file = 'my_data.RData')

How can I import a data frame from R saved as RData to pandas?

Generalized conversion of data frames turns out to be an expensive operation, as copying is required for some types of columns. A local conversion rule could be better:

from rpy2.robjects import r
from rpy2.robjects import pandas2ri
from rpy2.robjects import default_converter
from rpy2.robjects.conversion import localconverter

print(r.data('iris'))
with localconverter(default_converter + pandas2ri.converter) as cv:
    pd_iris = r('iris')
    # this is a pandas DataFrame
    pd_iris

Otherwise, the following is "just working" on this end (Linux, head of branch default for rpy2):

import pandas as pd
from rpy2.robjects import r
from rpy2.robjects import pandas2ri
pandas2ri.activate()

pd_iris = r('iris')
pd_iris

If it does not work for you, there might be an issue with rpy2 on Windows (yet another one; rpy2 is not fully supported on Windows).

Saving dataframes as .Rdata files using a for loop

You need to build the name of your dataset outside of save(). save() treats its first arguments as symbols or character strings naming the objects to save, so a function call such as paste0() placed there is not evaluated.

Also, you need to remove the quotation marks around the second paste0 and close the parentheses of save(); both look like typos.

for (i in 1:11) {
  dbname <- paste0("Dataset_", i)
  # list = takes the object names as character strings
  save(list = dbname, file = paste0("Hypothesis1/Dataset", i, ".RData"))
}

A better approach would be to use the apply functions: list the names of your datasets with ls() and pass each one to save() as a character string through the list = ... argument.

lapply(ls(pattern="Dataset[0-9]+"), function(x) save(list = x, file = paste0("Hypothesis1/",x,".RData")))

How to load R's .rdata files into Python?

There is a new Python package, pyreadr, that makes it very easy to import RData and Rds files into Python:

import pyreadr

result = pyreadr.read_r('mtcars_nms.rdata')

mtcars = result['mtcars_nms']

It does not depend on having R or other external dependencies installed.
It is a wrapper around the C library librdata, therefore it is very fast.
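
For the opposite direction, which is what the title question asks about, pyreadr also provides write_rdata. Below is a minimal sketch of a write followed by a read-back, assuming pyreadr and pandas are installed; the helper name write_and_read_back and the default file name are illustrative only.

```python
def write_and_read_back(df, path="my_data.RData", name="df"):
    """Write a pandas DataFrame to an .RData file and read it back.

    Hypothetical helper for illustration; assumes pyreadr is installed.
    """
    # Imported inside the function so this module still loads on machines
    # where pyreadr is not installed.
    import pyreadr

    # df_name is the name the data frame will have once loaded in R.
    pyreadr.write_rdata(path, df, df_name=name)

    # read_r returns a dict-like object keyed by the R object names.
    return pyreadr.read_r(path)[name]
```

In R, load("my_data.RData") should then create a data frame called df.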

You can install it very easily with pip:

pip install pyreadr

The repo is here: https://github.com/ofajardo/pyreadr

Disclaimer: I am the developer.

Save RData workspace in Python using rpy2

If staying with rpy2 to load the saved objects, you could just use Python's pickling (the equivalent of R's load/save - see http://pymotw.com/2/pickle/):
http://rpy.sourceforge.net/rpy2/doc-2.4/html/robjects_serialization.html
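
As a rough illustration of the pickling round trip mentioned above, here is a minimal sketch using only the standard library. A plain dict stands in for the objects to keep, so the example runs without rpy2 or R. The linked rpy2 page covers pickling rpy2 objects themselves; that is not shown here.

```python
import pickle

# Stand-in for the objects you want to keep. A plain dict is used here
# (an assumption for illustration) so the example runs without rpy2 or R.
workspace = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}

# "save": serialize the object to bytes. pickle.dump(obj, file) does the
# same but writes to an open binary file instead.
payload = pickle.dumps(workspace)

# "load": rebuild an equal, independent object from the bytes.
restored = pickle.loads(payload)
```

Note that a pickle file can only be read back from Python, not from R.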

Otherwise try:

from rpy2.robjects.packages import importr
base = importr('base')
base.save_image(<arguments here...>)

