Is there a way to guess the size of a data.frame based on rows, columns and variable types?

只谈情不闲聊 submitted on 2019-12-04 09:01:05

You can simulate an object of the expected shape and estimate the memory R uses to store it with object.size:

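# Simulate 1e5 rows x 150 numeric columns
m <- matrix(1, nrow = 1e5, ncol = 150)
m <- as.data.frame(m)
# Convert the first 20 columns to character
m[, 1:20] <- sapply(m[, 1:20], as.character)
# Caution: sapply() simplifies its result to a character matrix, so these
# columns end up as character; use lapply() if true factors are needed
m[, 29:30] <- sapply(m[, 29:30], as.factor)
object.size(m)
# 120017224 bytes
print(object.size(m), units = "Gb")
# 0.1 Gb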
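
The same figure can be approximated with plain arithmetic, without allocating anything. The sketch below is only a rough guide: it assumes a 64-bit build of R, where a double takes 8 bytes, an integer, logical or factor code takes 4 bytes, and each character element is an 8-byte pointer (the strings themselves, vector headers and attributes are ignored). The names `bytes_per_element` and `rough_df_bytes` are made up for illustration.

```r
# Approximate bytes per element for common column types (64-bit R assumed).
# Character columns cost more in practice because the strings are stored too.
bytes_per_element <- c(numeric = 8, integer = 4, logical = 4,
                       factor = 4, character = 8)

# Hypothetical helper: rough size in bytes from the row count and the
# number of columns of each type (a named numeric vector)
rough_df_bytes <- function(n_rows, n_cols_by_type) {
  sum(n_rows * n_cols_by_type * bytes_per_element[names(n_cols_by_type)])
}

# 1e5 rows and 150 columns at 8 bytes per element
est <- rough_df_bytes(1e5, c(numeric = 150))
est            # 120,000,000 bytes
est / 1024^3   # roughly 0.11 when expressed in units of 1024^3 bytes
```

For the simulated object above this gives 1e5 x 150 x 8 = 120,000,000 bytes, which is within about 0.02% of the 120017224 bytes that object.size reported.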

Check out the pryr package as well. Its object_size function may suit you slightly better. From Advanced R:

This function is better than the built-in object.size() because it accounts for shared elements within an object and includes the size of environments.
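
A small sketch of what "shared elements" means in practice. The list below holds three references to one vector; base object.size counts the vector once per element, which is the behaviour the quote contrasts with. The pryr call is guarded because the package may not be installed, and the variable names are illustrative only.

```r
x <- runif(1e4)
shared <- list(x, x, x)   # three references to the same vector

# Base R adds up the size of every list element, so x is counted three times
base_bytes <- as.numeric(object.size(shared))
base_bytes

# pryr is optional; run the comparison only if it is installed
if (requireNamespace("pryr", quietly = TRUE)) {
  print(pryr::object_size(shared))
}
```

For a freshly simulated data frame whose columns are all distinct vectors, the two functions should give similar answers, so the difference matters mostly for objects built from repeated pieces.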

You also need to account for the size of the attributes, as well as the column types:

object.size(attributes(m))
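
To see how much of a data frame is overhead rather than column data, one option is to compare the total size with the sum of the column sizes. This is a minimal sketch on a made-up two-column data frame; the variable names are hypothetical, and the two overhead figures it prints need not agree exactly.

```r
df <- data.frame(a = rnorm(1000), b = rnorm(1000))

# Size of the whole data frame, in bytes
total_bytes <- as.numeric(object.size(df))

# Sum of the sizes of the individual columns
column_bytes <- sum(vapply(df, function(col) as.numeric(object.size(col)),
                           numeric(1)))

# Size of the attribute list (names, row.names, class)
attr_bytes <- as.numeric(object.size(attributes(df)))

total_bytes - column_bytes   # overhead beyond the raw column data
attr_bytes                   # size reported for the attributes on their own
```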

You could also create dummy variables that hold examples of the data you will be storing in the data frame, use object.size() to find their size, and multiply by the number of rows and columns accordingly.
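
The dummy-variable approach could be sketched as follows: measure a small sample with the same column types and scale it linearly to the target row count. `estimate_df_size`, `sample_df` and the column names are made up for illustration, and the result slightly overestimates because fixed per-column overhead is scaled along with the data.

```r
# Hypothetical helper: estimate the size in bytes of a data frame with
# `n_rows` rows by scaling the column sizes of a small sample
estimate_df_size <- function(sample_df, n_rows) {
  # Per-column size of the sample, in bytes
  col_bytes <- vapply(sample_df, function(col) as.numeric(object.size(col)),
                      numeric(1))
  # Scale by the ratio of target rows to sample rows
  sum(col_bytes) * n_rows / nrow(sample_df)
}

# Small sample with the column types expected in the real data (made-up values)
sample_df <- data.frame(
  count = rep_len(c(0L, 5L, 12L), 1000),           # integer
  value = rnorm(1000),                             # numeric (double)
  group = factor(rep_len(c("a", "b", "c"), 1000))  # factor
)

est <- estimate_df_size(sample_df, n_rows = 1e5)
cat(round(est / 1024^2, 2), "MiB (estimated)\n")
```

Character columns with many distinct strings scale less predictably, so the sample should contain strings of realistic length and variety.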
