I have a bunch of .RData time-series files and would like to load them directly into Python without first converting the files to some other extension (such as .csv). Any id
Well, a couple of years ago I had the same problem as you. I wanted to read .RData files from a library that I was developing. I considered using RPy2, but that would have forced me to release my library under a GPL license, which I did not want to do.
"pyreadr" didn't even exist then. Also, the datasets that I wanted to load were not stored in a standardized format such as a data.frame.
I came to this question and read Spacedman's answer. In particular, I saw the line
So any other implementation in any other language is hard++.
as a challenge, and implemented the package rdata in a couple of days as a result. It is a very small pure-Python implementation of a .RData parser and converter, and it has suited my needs so far. The steps of parsing the original objects and converting them to appropriate Python objects are separated, so that users can use a different conversion if they want. Moreover, users can add constructors for custom R classes.
This is a usage example:
>>> import rdata
>>> parsed = rdata.parser.parse_file(rdata.TESTDATA_PATH / "test_vector.rda")
>>> converted = rdata.conversion.convert(parsed)
>>> converted
{'test_vector': array([1., 2., 3.])}
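To give an idea of why separating parsing from conversion is useful, here is a toy sketch of the design. It is not the actual code or API of rdata: ParsedObject, convert, and factor_constructor are hypothetical names made up for this illustration, and a real parsed R object carries much more information. The point is only that the converter looks up a constructor by R class name, so a user can plug in their own without touching the parser.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ParsedObject:
    """Hypothetical stand-in for a parsed R object (not rdata's real type)."""
    values: List[Any]
    attributes: Dict[str, Any] = field(default_factory=dict)


# A constructor receives the raw values and the R attributes.
Constructor = Callable[[List[Any], Dict[str, Any]], Any]


def convert(obj: ParsedObject, class_map: Dict[str, Constructor]) -> Any:
    """Convert a parsed object, using a constructor if its R class is known."""
    for r_class in obj.attributes.get("class", []):
        constructor = class_map.get(r_class)
        if constructor is not None:
            return constructor(obj.values, obj.attributes)
    # No constructor registered for any of its classes: return a plain list.
    return list(obj.values)


def factor_constructor(values: List[int], attributes: Dict[str, Any]) -> List[str]:
    """Map 1-based R factor codes to their level labels."""
    levels = attributes["levels"]
    return [levels[code - 1] for code in values]


# The same parsed object, converted in two different ways.
parsed = ParsedObject(
    values=[1, 2, 1],
    attributes={"class": ["factor"], "levels": ["a", "b"]},
)
plain = convert(parsed, {})
labelled = convert(parsed, {"factor": factor_constructor})
```

With an empty class map the factor comes back as its raw integer codes, while registering a constructor for "factor" yields the level labels instead. The parsing step does not need to know anything about either choice.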
As I said, I developed this package and have been using it since without problems, but I did not bother to give it visibility because I had not documented it properly. This has recently changed and the documentation is now mostly OK, so here it is for anyone interested:
https://github.com/vnmabus/rdata