I have a bunch of .RData time-series files and would like to load them directly into Python without first converting them to another format (such as .csv). Any ideas?
If you are using a Jupyter notebook, there are two steps:
Step 1: Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/#rpy2 and download the rpy2 wheel (the Python interface to the R language, embedded R). In my case I will use rpy2-2.8.6-cp36-cp36m-win_amd64.whl.
Put this file in your current working directory.
Step 2: In your Jupyter notebook, run the following commands:
# Install the rpy2 library from the downloaded wheel (Anaconda)
!pip install rpy2-2.8.6-cp36-cp36m-win_amd64.whl
and then
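# This is important if you will be using rpy2.
# The raw string (r'...') keeps the backslashes literal; otherwise '\r' becomes a carriage return.
import os
os.environ['R_USER'] = r'D:\Anaconda3\Lib\site-packages\rpy2'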
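One thing to watch with Windows paths like this one: in an ordinary Python string literal, `\r` is read as a carriage return, so a path containing `\rpy2` can be silently altered. A small sketch showing the difference a raw string makes (the variable names are only for illustration):

```python
# Plain literal: "\r" is interpreted as a single carriage-return character.
broken = 'site-packages\rpy2'

# Raw literal: the backslash and the "r" are kept as two separate characters.
raw = r'site-packages\rpy2'

print(repr(broken))  # shows the escaped carriage return
print(repr(raw))     # shows a literal backslash followed by "rpy2"

# The raw version is one character longer because the backslash survives.
length_difference = len(raw) - len(broken)
```

Writing the path with forward slashes, or doubling each backslash, should avoid the problem as well.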
and then
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
pandas2ri.activate()
This should allow you to use R functions in Python. Now get a handle on R's readRDS function as follows:
readRDS = robjects.r['readRDS']
df = readRDS('Data1.rds')
df = pandas2ri.ri2py(df)
df.head()
Congratulations! Now you have the DataFrame you wanted.
However, I advise you to save it as a pickle file for later use in Python:
df.to_pickle('Data1')
Next time you can simply load it with:
import pandas as pd
df1 = pd.read_pickle('Data1')
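The steps above read an .rds file. For .RData files, as in the question, a similar approach should work with R's `load` function, which restores the saved objects and returns their names. A minimal sketch, assuming rpy2 is installed and configured as described above; the helper name `load_rdata` and the file name `Data1.RData` are hypothetical:

```python
def load_rdata(path):
    """Return a dict mapping R object names to the objects stored in an .RData file."""
    # Imported inside the function so the helper can be defined
    # even before rpy2 has been installed.
    import rpy2.robjects as robjects

    # R's load() restores the saved objects into the R session and
    # returns a character vector containing their names.
    names = robjects.r['load'](path)

    # Look each restored object up by name in the embedded R session.
    return {name: robjects.r[name] for name in names}

# Hypothetical usage (requires a working R and rpy2 setup):
# objects = load_rdata('Data1.RData')
# print(list(objects))
```

Whether the returned objects arrive as pandas DataFrames or as rpy2 objects depends on the rpy2 version and on whether `pandas2ri.activate()` has been called. Newer rpy2 releases (3.x) changed the conversion API, so `pandas2ri.ri2py` may not be available there.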