I have downloaded a file in .rds format. How can I load it with pandas? It is supposed to be an R file, but I haven't been able to find any information on how to load it.
You could use the rpy2 interface to Pandas, in the following manner:
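import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
pandas2ri.activate()  # enable automatic R <-> pandas conversion
readRDS = robjects.r['readRDS']  # look up R's readRDS function
df = readRDS('my_file.rds')
df = pandas2ri.ri2py(df)  # convert to a pandas DataFrame (rpy2 2.x API)
# do something with the dataframe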
To follow up on @mgalardini's answer: in newer versions of rpy2 (3.0.4 in this example), the function that converts an R data frame to a pandas DataFrame has changed:
>>> import rpy2
>>> rpy2.__version__
'3.0.4'
>>> import rpy2.robjects as robjects
>>> from rpy2.robjects import pandas2ri
>>> readRDS = robjects.r['readRDS']
>>> df = readRDS('my_file.rds')
>>> df = pandas2ri.rpy2py_dataframe(df)
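If the same script has to run against both older and newer rpy2 installs, one option is to pick the conversion function by version. The following is only a sketch: `converter_name` and `rds_to_pandas` are hypothetical helper names, and it assumes the rename from `ri2py` to `rpy2py_dataframe` happened at the 3.0 major release, which I have not checked for every release.

```python
def converter_name(rpy2_version):
    """Return the pandas2ri function name to use for a given rpy2 version.

    Assumption: versions before 3.0 expose `ri2py`, and 3.0 and later
    expose `rpy2py_dataframe` (as seen with 3.0.4 above).
    """
    major = int(rpy2_version.split(".")[0])
    return "ri2py" if major < 3 else "rpy2py_dataframe"


def rds_to_pandas(path):
    """Hypothetical helper: read an .rds file into a pandas DataFrame."""
    # Imported lazily so that converter_name() works without rpy2 or R.
    import rpy2
    import rpy2.robjects as robjects
    from rpy2.robjects import pandas2ri

    read_rds = robjects.r["readRDS"]  # R's readRDS function
    r_df = read_rds(path)             # still an R data.frame here
    convert = getattr(pandas2ri, converter_name(rpy2.__version__))
    return convert(r_df)
```

Usage would be `df = rds_to_pandas('my_file.rds')`. R still has to be installed for this to work.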
If you would prefer not to install R (rpy2 requires it), there is a new package, "pyreadr", that reads Rds and RData files very easily.
It is a wrapper around the C library librdata, so it is very fast.
You can install it easily with pip:
pip install pyreadr
Then you can read your rds file:
import pyreadr
result = pyreadr.read_r('/path/to/file.Rds') # also works for RData
# done!
# result is a dictionary whose keys are the object names and whose values are
# Python objects. An Rds file holds a single object, stored under the key None.
df = result[None] # extract the pandas data frame
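Since the same call handles both Rds and RData files, it can be handy to unwrap the result in one place. This is a minimal sketch, not part of pyreadr: `extract_objects` and `load_r_file` are hypothetical names, and returning the bare object whenever the result has exactly one entry is a design choice made here for illustration.

```python
from collections.abc import Mapping


def extract_objects(result: Mapping):
    """Unwrap a dictionary like the one returned by pyreadr.read_r.

    One entry (the Rds case, keyed by None): return the object itself.
    Several entries (the RData case): return a plain dict of name -> object.
    """
    if len(result) == 1:
        # Single object: return it regardless of its key.
        return next(iter(result.values()))
    # Several named objects: hand them back as a regular dict.
    return dict(result)


def load_r_file(path):
    """Hypothetical helper: read an Rds or RData file and unwrap the result."""
    import pyreadr  # imported lazily so extract_objects() works without it

    return extract_objects(pyreadr.read_r(path))
```

With this, `df = load_r_file('/path/to/file.Rds')` gives the data frame directly, while an RData file with several objects gives a dict keyed by object name.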
The repo is here: https://github.com/ofajardo/pyreadr
Disclaimer: I am the developer of this package.