Tricks to manage the available memory in an R session

情深已故 2020-11-22 01:23

What tricks do people use to manage the available memory of an interactive R session? I use the functions below [based on postings by Petr Pikal and David Hinds to the r-he

27 answers
  •  余生分开走
    2020-11-22 01:58

    For both speed and memory, when building a large data frame through a complex series of steps, I periodically flush the in-progress data frame to disk, appending to whatever was written before, and then start over with an empty one. That way the intermediate steps only work on smallish data frames (which helps because, e.g., rbind slows down considerably as objects grow). Once the loop is done and the intermediate objects have been removed, the entire data set can be read back in.

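    dfinal <- NULL
    first <- TRUE
    tempfile <- "dfinal_temp.csv"   # note: write.table() is space-delimited by default
    for( i in bigloop ) {
        if( i %% 10000 == 0 ) {
            # cat() rather than print(): print( i, "..." ) would treat the string as 'digits'
            cat( i, "; flushing to disk...\n" )
            # header only on the first flush; no row names, so chunks cannot collide on re-read
            write.table( dfinal, file=tempfile, append=!first, col.names=first, row.names=FALSE )
            first <- FALSE
            dfinal <- NULL   # nuke it
        }
    
        # ... complex operations here that add data to 'dfinal' data frame  
    }
    cat( "Loop done; flushing to disk and re-reading entire data set...\n" )
    # append=!first / col.names=first also covers the case where the loop never flushed
    write.table( dfinal, file=tempfile, append=!first, col.names=first, row.names=FALSE )
    dfinal <- read.table( tempfile, header=TRUE )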
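
For anyone who wants to try the pattern end to end, here is a small self-contained sketch of the same flush-and-restart idea. The toy loop, the flush interval of 10, the names `chunk`, `chunk_size`, `out_file` and `result`, and the scratch file under `tempdir()` are all assumptions made for illustration; the real "complex operations" would replace the `rbind` line.

```r
# Hypothetical demo: build 25 rows one at a time, flushing every 10 iterations.
chunk_size <- 10L                                      # assumed flush interval
out_file   <- file.path(tempdir(), "dfinal_demo.txt")  # hypothetical scratch file
if (file.exists(out_file)) file.remove(out_file)

chunk <- NULL
first <- TRUE
for (i in 1:25) {
    # stand-in for the complex operations that add rows
    chunk <- rbind(chunk, data.frame(id = i, value = i^2))

    if (i %% chunk_size == 0) {
        # header on the first flush only; later flushes append
        write.table(chunk, file = out_file, append = !first,
                    col.names = first, row.names = FALSE)
        first <- FALSE
        chunk <- NULL                                  # drop the in-memory piece
    }
}

# flush whatever is left after the last full chunk
if (!is.null(chunk)) {
    write.table(chunk, file = out_file, append = !first,
                col.names = first, row.names = FALSE)
}

rm(chunk)        # remove the intermediate object
invisible(gc())  # trigger a garbage collection before re-reading

result <- read.table(out_file, header = TRUE)
```

Writing with `row.names = FALSE` and reading with `header = TRUE` keeps the file a plain header-plus-rows table, so the chunks concatenate cleanly no matter how many flushes happened.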
    
