Saving a data frame as a binary file

自闭症患者 2020-12-15 08:30

I would like to save a whole bunch of relatively large data frames while minimizing the space that the files take up. When opening the files, I need to be able to control wh

2 answers
  • 2020-12-15 08:53

    Your best bet is to use rda files. You can use the save() and load() commands to write and read:

    set.seed(101)
    a = data.frame(x1=runif(10), x2=runif(10), x3=runif(10))
    
    save(a, file="test.rda")
    load("test.rda")
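
    Since the question is about minimizing file size, it may be worth trying the compress argument of save(). The following is only a sketch: the larger data frame and the temporary file names are assumptions made for illustration, and the actual savings depend heavily on the data.

    ```r
    # Sketch: write the same data frame with different compression settings
    # and compare the resulting file sizes. Paths are temporary files.
    set.seed(101)
    a <- data.frame(x1 = runif(1e4), x2 = runif(1e4), x3 = runif(1e4))

    f_none <- tempfile(fileext = ".rda")
    f_gzip <- tempfile(fileext = ".rda")
    f_xz   <- tempfile(fileext = ".rda")

    save(a, file = f_none, compress = FALSE)
    save(a, file = f_gzip, compress = "gzip")  # what compress = TRUE uses
    save(a, file = f_xz,   compress = "xz")    # usually smaller, slower to write

    # Sizes in bytes, labelled by compression setting
    sizes <- setNames(file.size(c(f_none, f_gzip, f_xz)), c("none", "gzip", "xz"))
    print(sizes)
    ```

    load() detects the compression type on its own, so reading the file back does not change.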
    

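    Edit: For completeness, here is what Harlan's suggestion might look like (i.e. wrapping the load command so that it returns the data frame). The object name is passed as a string, and the file is loaded into a private environment so that nothing in the caller's workspace is overwritten:

    loadx <- function(x, file) {
      env <- new.env()          # private environment to receive the loaded objects
      load(file, envir = env)
      get(x, envir = env)       # x is the name of the saved object, as a string
    }

    b <- loadx("a", "test.rda") # the result can be assigned to any name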
    

    Alternatively, have a look at the hdf5, RNetCDF and ncdf packages. I've experimented with the hdf5 package in the past; this uses the NCSA HDF5 library. It's very simple:

    hdf5save(fileout, ...)
    hdf5load(file, load = TRUE, verbosity = 0, tidy = FALSE)
    

    A last option is to use binary file connections, but that won't work well in your case because readBin and writeBin only support vectors.

    Here's a trivial example. First open the connection for writing in binary mode ("w" with "b" appended) and write some data:

    zz <- file("testbin", "wb")
    writeBin(1:10, zz)
    close(zz)
    

    Then open the connection for reading in binary mode ("r" with "b" appended) and read the first four integers back:

    zz <- file("testbin", "rb")
    readBin(zz, integer(), 4)
    close(zz)
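
    To illustrate why the vector-only restriction is awkward for data frames, here is a rough sketch of what a column-by-column workaround could look like. The helpers write_df_bin and read_df_bin are hypothetical names, not part of any package. They assume every column is a double, and they store no column names or types, so the reader has to supply the names.

    ```r
    # Hypothetical helper: write an all-double data frame as a small header
    # (number of rows, number of columns) followed by each column's values.
    write_df_bin <- function(df, path) {
      stopifnot(all(vapply(df, is.double, logical(1))))
      con <- file(path, "wb")
      on.exit(close(con))
      writeBin(c(nrow(df), ncol(df)), con)  # header: two integers
      for (col in df) writeBin(col, con)    # one vector per column
      invisible(path)
    }

    # Hypothetical helper: read the header, then one double vector per column.
    read_df_bin <- function(path, col_names = NULL) {
      con <- file(path, "rb")
      on.exit(close(con))
      dims <- readBin(con, integer(), n = 2L)
      cols <- lapply(seq_len(dims[2]),
                     function(i) readBin(con, double(), n = dims[1]))
      if (is.null(col_names)) col_names <- paste0("V", seq_len(dims[2]))
      names(cols) <- col_names
      as.data.frame(cols)
    }

    # Usage: round-trip a small data frame through a temporary file
    set.seed(101)
    a <- data.frame(x1 = runif(10), x2 = runif(10), x3 = runif(10))
    path <- tempfile()
    write_df_bin(a, path)
    b <- read_df_bin(path, col_names = names(a))
    ```

    Factors, character columns and missing metadata would all need extra handling, which is why save() or saveRDS() is usually the simpler choice.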
    
  • 2020-12-15 09:05

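    You may have a look at saveRDS and readRDS. They serialize a single R object to a file. Unlike load(), readRDS returns the object instead of restoring it under its original name, so you choose the variable name when reading.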

    x = data.frame(x1=runif(10), x2=runif(10), x3=runif(10))
    
    saveRDS(x, file="myDataFile.rds")
    x <- readRDS(file="myDataFile.rds")
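
    For a whole batch of data frames, one possible pattern is one .rds file per data frame, so that each can be read back on its own under whatever name is convenient. This is a sketch under assumptions: the list frames, the directory out_dir and the file naming scheme are made up for illustration, and compress = "xz" is one of the options saveRDS accepts when file size matters more than write speed.

    ```r
    # Sketch: save several data frames, one file each, with xz compression.
    set.seed(101)
    frames <- list(
      first  = data.frame(x1 = runif(10), x2 = runif(10)),
      second = data.frame(y1 = runif(10), y2 = runif(10))
    )

    out_dir <- tempfile("frames_")  # hypothetical output directory
    dir.create(out_dir)

    for (nm in names(frames)) {
      saveRDS(frames[[nm]],
              file = file.path(out_dir, paste0(nm, ".rds")),
              compress = "xz")
    }

    # Read back only the data frame that is needed, under a name of our choosing
    my_copy <- readRDS(file.path(out_dir, "second.rds"))
    ```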
    