R reading a huge CSV

慢半拍i 2020-12-23 23:12

I have a huge CSV file, around 9 GB in size, and 16 GB of RAM. I followed the advice from the page and implemented it below.

If you get the error
5 Answers
  •  庸人自扰 2020-12-23 23:42

    Make sure you're using 64-bit R, not just 64-bit Windows, so that you can increase your RAM allocation to all 16 GB.
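
    A quick check that you are actually on a 64-bit build, as a minimal sketch:

    .Machine$sizeof.pointer  # 8 means 64-bit R, 4 means 32-bit
    R.version$arch           # e.g. "x86_64" on a 64-bit build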

    In addition, you can read in the file in chunks:

    file_in    <- file("in.csv", "r")
    chunk_size <- 100000 # choose the best size for you
    while (length(x <- readLines(file_in, n = chunk_size)) > 0) {
      # process the current chunk in x here
    }
    close(file_in)
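
    Inside that loop, each chunk is just a character vector of raw lines; a sketch of parsing one chunk into a data frame (assuming comma-separated fields, with any header row handled separately):

    # hypothetical helper step: build a data frame from the current chunk x
    chunk_df <- read.csv(text = paste(x, collapse = "\n"), header = FALSE)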
    

    You can use the data.table package to read and manipulate large files more efficiently:

    library(data.table)
    x <- fread("in.csv", header = TRUE)
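
    fread can also limit what it loads; a sketch using its select and nrows arguments (the column names here are assumptions, not from the original file):

    # read only two columns and the first million rows
    x <- fread("in.csv", select = c("id", "value"), nrows = 1e6)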
    

    If needed, you can fall back on disk-backed storage with the ff package:

    library("ff")
    x <- read.csv.ffdf(file = "in.csv", header = TRUE, VERBOSE = TRUE,
                       first.rows = 10000, next.rows = 50000, colClasses = NA)
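
    The resulting ffdf object keeps the data on disk; a minimal sketch of working with it (assuming the x created above):

    dim(x)    # dimensions, without pulling the table into RAM
    x[1:5, ]  # materialise only the first five rows as a regular data frame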
    
