Question
I am training a bagFDA model using the train() function in the R caret package and saving the model output as an .Rdata file. The input has about 300k records with 26 variables, but the output .Rdata file is 3 GB. I simply run the following:
modelout <- train(x,y,method="bagFDA")
save(file= "myout.Rdata", modelout)
on a Windows system.
Questions: (1) Why is myout.Rdata so big? (2) How can I reduce the size of the file?
Thanks in advance!
JT
Answer 1:
In trainControl, set returnData = FALSE for starters, so you're not storing an extra copy of the data in the model. My understanding is that bagFDA fits a model on each of a number of bootstrap samples, which essentially keeps that many copies of your data. Lowering the B parameter, which defaults to 50, should shrink it as well.
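As a rough sketch of both suggestions together, something like the following should work. The iris data here is only a small stand-in for your x and y, the cross-validation settings and B = 10 are arbitrary illustrative choices, and I'm assuming that B is forwarded through train()'s ... argument to bagFDA(). The object.size() call is just one way to see which pieces of the object are large before you save it.

```r
library(caret)  # method = "bagFDA" also needs the earth and mda packages

# Stand-in data (assumption): x is a data frame of predictors, y is a factor.
set.seed(1)
x <- iris[, 1:4]
y <- iris$Species

# returnData = FALSE: do not keep a copy of the training data in the result.
# The resampling settings are illustrative only.
ctrl <- trainControl(method = "cv", number = 3, returnData = FALSE)

# B (number of bootstrap fits, default 50) is passed on to bagFDA() via `...`.
modelout <- train(x, y, method = "bagFDA", trControl = ctrl, B = 10)

# Size in bytes of each top-level component, largest first.
sizes <- sapply(modelout, object.size)
print(sort(sizes, decreasing = TRUE))

save(modelout, file = "myout.Rdata")
```

If finalModel still dominates the size after this, the bulk is in the stored bootstrap fits themselves, and lowering B further is the main lever left from the two options above.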
Also, check out this post: "Why is caret train taking up so much memory?"
Source: https://stackoverflow.com/questions/42425513/huge-size-in-model-output-from-train-function-in-r-caret-package