Tricks to manage the available memory in an R session

情深已故 2020-11-22 01:23

What tricks do people use to manage the available memory of an interactive R session? I use the functions below [based on postings by Petr Pikal and David Hinds to the r-he

27 answers
  •  没有蜡笔的小新
    2020-11-22 01:57

    I make aggressive use of subset(), with select= restricted to only the required variables, when passing data frames to the data= argument of regression functions. It does cause errors if I forget to add a variable to both the formula and the select= vector, but it still saves a lot of time because fewer objects get copied, and it shrinks the memory footprint significantly. Say I have 4 million records with 110 variables (and I do). Example:

    # library(rms); library(Hmisc) are needed for the cph() and rcs() functions
    Mayo.PrCr.rbc.mdl <-
      cph(formula = Surv(surv.yr, death) ~ age + Sex + nsmkr + rcs(Mayo, 4) +
                                           rcs(PrCr.rat, 3) + rbc.cat * Sex,
          # keep only the rows and columns the model actually uses
          data = subset(set1HLI, gdlab2 & HIVfinal == "Negative",
                        select = c("surv.yr", "death", "PrCr.rat", "Mayo",
                                   "age", "Sex", "nsmkr", "rbc.cat")))
    

    For context on the strategy: gdlab2 is a logical vector flagging subjects whose values were all normal or nearly normal across a set of laboratory tests, and HIVfinal is a character vector summarizing preliminary and confirmatory testing for HIV.
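To get a rough feel for how much the row-and-column subsetting described above can save, here is a minimal base-R sketch. Everything in it is hypothetical: the toy data frame `big`, its `keep` flag, and the column names stand in for the real dataset, and `lm()` is used only as a stand-in for `cph()` so that the example runs without extra packages.

```r
# Hypothetical toy data standing in for a wide data frame.
set.seed(1)
n <- 10000
big <- as.data.frame(matrix(rnorm(n * 50), nrow = n))  # columns V1..V50
big$keep <- rep(c(TRUE, FALSE), length.out = n)        # logical row filter

# Subset rows and columns in one step, before the data reaches the model.
small <- subset(big, keep, select = c("V1", "V2", "V3"))

# Compare the memory footprint of the full and the reduced data frame.
size_big   <- object.size(big)
size_small <- object.size(small)
print(size_big, units = "MB")
print(size_small, units = "MB")

# Stand-in model call (lm() instead of cph(), an assumption for portability):
# the formula may only use variables listed in select=.
fit <- lm(V1 ~ V2 + V3, data = small)
```

As in the answer, a variable that appears in the formula but not in `select=` will trigger an "object not found" error, unless an object with that name happens to exist in the formula's environment. That is the price of handing the modelling function a much smaller object.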
