Bootstrapping multiple columns in data.table in a scalable fashion R

Submitted by 拜拜、爱过 on 2020-01-25 20:49:05

Question


This is a follow-up to an earlier question. In the original question, the OP wanted to bootstrap two fixed columns, x1 and x2:

library(data.table)
library(boot)

set.seed(1000)
data <- as.data.table(list(x1 = runif(200), x2 = runif(200), group = runif(200) > 0.5))
# statistic: means of x1 and x2 over the resampled rows i
stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2))]}
data[, list(list(boot(.SD, stat, R = 10))), by = group]$V1

However, I think this problem can be extended to handle any number of columns by treating them as groups. For instance, let's use the iris dataset. Say I want to calculate the bootstrap mean of all four measurements for each species. I can use melt to reshape the data into long format and then group by the Species and variable combination to get all the means in one go. I think this approach will scale well.

data(iris)
iris <- as.data.table(iris)
iris[, mean(Sepal.Length), by = Species]
iris[, ID := .I]  # row index as an identifier
iris_deep = melt(iris
                 ,id.vars = c("ID","Species")
                 ,measure.vars = c("Sepal.Length","Sepal.Width","Petal.Length","Petal.Width"))
#define a mean bootstrap function
stat <- function(x, i) {x[i, m=mean(value),]}
iris_deep[, list(list(boot(.SD, stat, R = 100))), by = list(Species,variable)]$V1

That is my attempt. However, the bootstrapping part does not work; R throws the following error:

Error in mean(value) : object 'value' not found

Can someone please take a crack at this?


Answer 1:


I tried this (with parentheses added around m=mean(value)) and it appears to work:

stat <- function(x, i) {x[i, (m=mean(value))]}
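
For completeness, here is a minimal end-to-end sketch of the melted approach with the parenthesised statistic. A likely explanation for the original error (an assumption, not something stated in the answer) is that without the parentheses, m = mean(value) is passed to `[` as a named argument instead of as the j expression, so value is looked up outside the table. The names iris_dt, fit, and summary_dt are illustrative only and do not come from the thread.

```r
library(data.table)
library(boot)

set.seed(1000)

# Work on a copy so the built-in iris data frame is left untouched
iris_dt <- as.data.table(iris)
iris_dt[, ID := .I]  # row index as an identifier

# Long format: one row per (ID, Species, variable) combination
iris_deep <- melt(iris_dt,
                  id.vars = c("ID", "Species"),
                  measure.vars = c("Sepal.Length", "Sepal.Width",
                                   "Petal.Length", "Petal.Width"))

# Parenthesised j is evaluated inside the data.table, so `value` is found.
# The name m is optional here; x[i, mean(value)] gives the same number.
stat <- function(x, i) x[i, (m = mean(value))]

# One boot object per Species/variable group, stored in a list column
res <- iris_deep[, list(fit = list(boot(.SD, stat, R = 100))),
                 by = list(Species, variable)]

# Pull the observed mean and the bootstrap standard error from each object
summary_dt <- res[, list(observed = fit[[1]]$t0,
                         boot_se  = sd(fit[[1]]$t[, 1])),
                  by = list(Species, variable)]
print(summary_dt)
```

Keeping the boot objects in a list column means any other summary (for example boot.ci on each element) can be computed later without rerunning the resampling.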


Source: https://stackoverflow.com/questions/38989932/bootstrapping-multiple-columns-in-data-table-in-a-scalable-fashion-r
