Loading a .rds file in Pandas

梦谈多话 2020-12-13 13:46

I have downloaded a file in .rds format. How can I load it with Pandas? It is supposed to be an R file, but I haven't been able to find any information on how to load it.

3 answers
  • 2020-12-13 14:26

    You could use the rpy2 interface to Pandas, in the following manner:

    import rpy2.robjects as robjects
    from rpy2.robjects import pandas2ri
    pandas2ri.activate()
    
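    # Look up R's readRDS function and call it on the file
    readRDS = robjects.r['readRDS']
    df = readRDS('my_file.rds')
    # Convert the R data frame to pandas (ri2py is the name used by older rpy2 releases)
    df = pandas2ri.ri2py(df)
    # do something with the dataframe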
    
  • 2020-12-13 14:33

    To follow up on @mgalardini's answer: in newer versions of rpy2 (3.0.4 here), the method that converts an R data frame to a pandas DataFrame has a different name:

    >>> import rpy2
    >>> rpy2.__version__
    '3.0.4'
    >>> import rpy2.robjects as robjects
    >>> from rpy2.robjects import pandas2ri
    >>> readRDS = robjects.r['readRDS']
    >>> df = readRDS('my_file.rds')
    >>> df = pandas2ri.rpy2py_dataframe(df)
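
    Since the two answers above differ only in the name of the conversion function, one way to write code that tolerates either rpy2 version is to look the function up by name. This is a minimal sketch, not an rpy2 API: `pick_dataframe_converter` and `read_rds_with_rpy2` are hypothetical helper names, and it assumes the only relevant difference between versions is `ri2py` versus `rpy2py_dataframe`.

```python
from types import SimpleNamespace


def pick_dataframe_converter(pandas2ri_module):
    """Return whichever R-to-pandas conversion function the module exposes.

    Tries the newer name (rpy2py_dataframe) first, then the older one (ri2py).
    """
    for name in ("rpy2py_dataframe", "ri2py"):
        converter = getattr(pandas2ri_module, name, None)
        if callable(converter):
            return converter
    raise AttributeError("no known data frame converter found on this module")


def read_rds_with_rpy2(path):
    """Hypothetical helper: read an .rds file through R, return a pandas DataFrame."""
    # Imported lazily so the selection logic above is usable without R installed.
    import rpy2.robjects as robjects
    from rpy2.robjects import pandas2ri

    r_df = robjects.r["readRDS"](path)
    return pick_dataframe_converter(pandas2ri)(r_df)


# Stand-ins for the two module layouts, used only to demonstrate the selection.
old_style = SimpleNamespace(ri2py=lambda obj: ("old", obj))
new_style = SimpleNamespace(rpy2py_dataframe=lambda obj: ("new", obj))
```

    With R and rpy2 installed, `df = read_rds_with_rpy2('my_file.rds')` would then replace the last three lines of either snippet above.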
    
  • 2020-12-13 14:34

    If you would prefer not to install R (rpy2 requires it), there is a new package, "pyreadr", that reads Rds and RData files directly.

    It is a wrapper around the C library librdata, so it is very fast.

    You can install it easily with pip:

    pip install pyreadr
    

    Then you can read your rds file:

    import pyreadr
    
    result = pyreadr.read_r('/path/to/file.Rds') # also works for RData
    
    # done!
    # result is a dictionary whose keys are the names of the objects and whose values
    # are Python objects. An Rds file holds a single object, stored under the key None
    df = result[None] # extract the pandas data frame
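
    If the same code may receive either an Rds file (one object under the key None) or an RData file that happens to hold a single named object, the lookup can be wrapped so the key is not hard-coded. This is only a sketch: `extract_single_object` and `load_rds` are hypothetical helper names, not part of pyreadr, and the sketch assumes `read_r` returns a dictionary-like result as described above.

```python
def extract_single_object(result):
    """Return the only object in a pyreadr-style result mapping.

    Raises ValueError if the mapping holds zero or several objects, so a
    multi-object RData file is not silently reduced to its first entry.
    """
    if len(result) != 1:
        raise ValueError(
            f"expected exactly one object, found {len(result)}: {list(result)}"
        )
    # For an Rds file the single key is None; taking the first value
    # avoids depending on what the key is.
    return next(iter(result.values()))


def load_rds(path):
    """Hypothetical helper: read a file with pyreadr and return its single object."""
    # Imported lazily so extract_single_object can be used without pyreadr.
    import pyreadr

    return extract_single_object(pyreadr.read_r(path))
```

    With pyreadr installed, `df = load_rds('/path/to/file.Rds')` would then do the read and the extraction in one call.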
    

    The repo is here: https://github.com/ofajardo/pyreadr

    Disclaimer: I am the developer of this package.
