rpy2 does not convert back to pandas
In R, when calling source() by default on a script without named functions, the returned object is a list of two named components, $value and $visible, where:
$value is the last displayed or defined object, which in your case is the far_df data frame (in R, a data.frame is a class object extending the list type);
$visible is a boolean vector indicating whether the last object was displayed, which in your case is TRUE. This would be FALSE had you ended the script at far_df <- tidy.sts(surveil_ts_4_far).
In fact, your Python error confirms this output, indicating a list of [ListSexpVector, BoolSexpVector].
Therefore, since you only want the first item, index it by number or by name.
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
from rpy2.robjects.conversion import localconverter

r_raw = ro.r['source']('farrington.R')       # IN R: r_raw <- source('farrington.R')
r_df = r_raw[0]                              # IN R: r_df <- r_raw[[1]]
r_df = r_raw[r_raw.names.index('value')]     # IN R: r_df <- r_raw$value (same item, by name)
with localconverter(ro.default_converter + pandas2ri.converter):
    pd_from_r_df = ro.conversion.rpy2py(r_df)
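As a hedged sketch of the by-name lookup above, the helper below wraps the names.index() pattern and raises a clear error when the component is missing. Both pick_component and FakeRList are hypothetical names introduced here for illustration, not part of rpy2. The stand-in class only mimics the .names attribute and integer indexing that the answer relies on, so the snippet runs without R installed.

```python
def pick_component(r_list, name):
    """Return the first component of a named, list-like object whose name matches."""
    names = list(r_list.names)
    if name not in names:
        raise KeyError(f"no component named {name!r}; available: {names}")
    return r_list[names.index(name)]


class FakeRList:
    """Stand-in (an assumption) mimicking the shape of what source() returns."""

    def __init__(self, **components):
        self.names = list(components)
        self._items = list(components.values())

    def __getitem__(self, i):
        return self._items[i]


# Placeholder values; a real source() result would hold R objects instead.
r_raw = FakeRList(value="<far_df data frame>", visible=True)
r_df = pick_component(r_raw, "value")
print(r_df)
```

With a real rpy2 result, the same call would presumably be pick_component(r_raw, 'value'), assuming the list carries names.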
Use rpy2 with pandas dataframe
You are almost there. To run R functions on your data, you first need to convert the pandas DataFrame to an R data frame. Once you have the R object, you can call R functions on it as shown below.
import numpy as np
import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri  # install any dependency package if you get an error like "module not found"
from rpy2.robjects.conversion import localconverter
from rpy2.robjects.packages import importr

base = importr('base')  # import R's "base" package

# Create pandas df
df = pd.DataFrame(np.random.randn(5, 2),  # 5 rows, 2 columns
                  columns=["A", "B"],  # name of columns
                  index=["Max", "Nathy", "Tom", "Joe", "Kathy"])

# Convert pandas to R (rpy2 >= 3.0; rpy2 2.x spelled this pandas2ri.py2ri(df))
with localconverter(ro.default_converter + pandas2ri.converter):
    r_df = ro.conversion.py2rpy(df)
print(type(r_df))

# Call a function from the base package
print(base.summary(r_df))
Converting a Pandas DataFrame to R dataframe using Rpy2
Unfortunately, this is going to be difficult: the Python -> R conversion is better than it used to be, but it isn't perfect, and it is still hard on Windows, which it looks like you're using.
This is a bit of a hack, but as a workaround you might try setting the name and time variables while you are assigning the pd.DataFrame, before you convert the DataFrame into R.
Once it's in R, you'll need to use R functions to operate on the data frame, rather than your Python functions. Even your getter and setter will need to be passed into the R environment in a way that looks more like this:
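import rpy2.robjects as robjects

# The string is evaluated by R: it defines f in R's global environment.
# robjects.r() returns the value of the last expression, so myfunct
# holds the result of f(3), not the function itself.
myfunct = robjects.r('''
    f <- function(r, verbose=FALSE) {
        if (verbose) {
            cat("I am calling f().\n")
        }
        2 * pi * r
    }
    f(3)
''')
r_f = robjects.globalenv['f']  # the R function f, callable from Python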
(This example is taken from the rpy2 documentation.)
But just to check that your DataFrame is being converted appropriately in the first place, you might start your debugging by running this:
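import pandas as pd
import numpy as np
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
from rpy2.robjects.conversion import localconverter
from datetime import datetime

n = 10
df = pd.DataFrame({
    "timestamp": [datetime.now() for t in range(n)],
    "value": np.random.uniform(-1, 1, n)
})
# pandas.rpy.common (com.convert_to_r_dataframe) has been removed from pandas;
# rpy2's own pandas converter (rpy2 >= 3.0) is used here instead.
with localconverter(ro.default_converter + pandas2ri.converter):
    r_dataframe = ro.conversion.py2rpy(df)
print(r_dataframe)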
Is that producing something that looks like an R print statement of a data frame, like so?
>>> timestamp value
0 2014-06-03 15:02:20 -0.36672....
1 2014-06-03 15:02:20 -0.89136....
2 2014-06-03 15:02:20 0.509215....
3 2014-06-03 15:02:20 0.862909....
4 2014-06-03 15:02:20 0.389879....
5 2014-06-03 15:02:20 -0.80607....
6 2014-06-03 15:02:20 -0.97116....
7 2014-06-03 15:02:20 0.376419....
8 2014-06-03 15:02:20 0.848243....
9 2014-06-03 15:02:20 0.446798....
Example peeled from here and here.
rpy2 How to assign R dataframe to value/values
One way to achieve what you want is:
r_df[r_df.colnames.index('col1')] = base.as_Date(r_df.rx2('col1'), '%Y-%m-%d')
Why is something like r_df['col1'] not implemented? Because R can be peculiar, and a lot of choices in rpy2 prefer a slight annoyance to a source of very hard-to-debug issues. Here this is because column names in an R data frame are not enforced to be unique, and getting an item by name will return the first one with that name. For example:
import rpy2.robjects as ro
dataf = ro.r('data.frame(x=1:3, x=4:6, check.names=FALSE)')
print(dataf)
# x x
# 1 1 4
# 2 2 5
# 3 3 6
dataf.rx2('x')
# R object with classes: ('RTYPES.INTSXP',) mapped to:
# [1, 2, 3]
The Python method index is present in Python list, tuple, etc., and is documented to return the first matching index.
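A minimal pure-Python sketch of that behaviour, using a plain list of column names rather than an R object: index() silently returns the first match, so duplicates go unnoticed unless you check for them. The helper unique_index is a hypothetical name introduced for illustration, not an rpy2 function.

```python
def unique_index(names, name):
    """Return the position of name, refusing to guess when it is ambiguous."""
    positions = [i for i, n in enumerate(names) if n == name]
    if not positions:
        raise KeyError(name)
    if len(positions) > 1:
        raise ValueError(f"{name!r} appears {len(positions)} times: {positions}")
    return positions[0]


colnames = ["x", "x", "y"]          # duplicate names, similar to the data frame above
print(colnames.index("x"))          # list.index reports only the first match
print(unique_index(colnames, "y"))  # unambiguous name: its position is returned
```

Calling unique_index(colnames, "x") raises a ValueError instead of quietly picking the first column, which is the kind of safeguard you may want before assigning to a column by name.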