Most efficient way to run regression models for multiple independent variables on the same list of 80 dependent outcomes?

Submitted by 爷,独闯天下 on 2021-01-28 01:48:59

Question


What is the most efficient way to run regression models for a list of 20 independent variables (e.g. genetic variants, each tested on its own) and 40 dependent variables? I am a beginner in R! I found a solution, but it only works with a single independent variable, and I'm not sure how to extend it to many (http://techxhum.dk/loop-multiple-variables/).

Thanks for your time.


Answer 1:


Here's a somewhat dense solution that uses the mfastLmCpp() function from the MESS package. It fits a separate simple linear regression for each of several independent variables, so we just wrap it in an apply() call to handle multiple dependent variables as well.

N <- 1000  # Number of observations
Nx <- 20   # Number of independent variables
Ny <- 80   # Number of dependent variables

# Simulate outcomes that are all standard Gaussians
Y <- matrix(rnorm(N*Ny), ncol=Ny)  
X <- matrix(rnorm(N*Nx), ncol=Nx)

# Now loop over each dependent variable and get the t test statistics
# for each independent variable
apply(Y, 2, FUN=function(y) { MESS::mfastLmCpp(y=y, x=X) })

With the above setup it takes less than a second on my laptop.
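
If you want to sanity-check the numbers without installing anything, here is a minimal base R sketch. It is not how MESS does it internally; it relies on the standard identity that, for a simple regression with an intercept, the slope's t statistic equals r * sqrt((N - 2) / (1 - r^2)), where r is the Pearson correlation. The names r, tstat, pval and fit are just illustrative, and the pair (x column 5, y column 3) used for the spot check is arbitrary.

```r
set.seed(1)
N  <- 1000  # Number of observations
Nx <- 20    # Number of independent variables
Ny <- 80    # Number of dependent variables

Y <- matrix(rnorm(N * Ny), ncol = Ny)
X <- matrix(rnorm(N * Nx), ncol = Nx)

# Correlation of every x column with every y column (Nx x Ny matrix)
r <- cor(X, Y)

# t statistic of the slope in each simple regression y ~ x,
# with N - 2 degrees of freedom, plus the two-sided p-value
tstat <- r * sqrt((N - 2) / (1 - r^2))
pval  <- 2 * pt(abs(tstat), df = N - 2, lower.tail = FALSE)

# Spot-check one pair against lm(): row 2 of the coefficient table is the slope
fit <- summary(lm(Y[, 3] ~ X[, 5]))$coefficients
all.equal(tstat[5, 3], fit[2, "t value"])
```

Rows of tstat and pval correspond to independent variables and columns to dependent variables. With 20 x 80 = 1600 tests you will probably want to adjust the p-values afterwards, e.g. with p.adjust().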


Update: This functionality is now available through the plr function in the MESS package.

devtools::install_github('ekstroem/MESS')
library(MESS)  # makes plr() available
plr(Y, X)

et voila!



Source: https://stackoverflow.com/questions/59337879/most-efficient-way-to-run-regression-models-for-multiple-independent-variables-o
