T-test with grouping variable

Posted by 不羁岁月 on 2019-12-25 05:34:08

Question


I've got a data frame with 36 variables and 74 observations. I'd like to run a two-sample paired t-test on 35 of the variables, using the remaining variable (which has two levels) as the grouping variable.

For example, the data frame contains the variables "age", "body weight", and "group". I suppose I can run the t-test for each variable with code like this:

t.test(age~group)

But is there a way to test all 35 variables with a single command, rather than one by one?


Answer 1:


An example data frame:

dat <- data.frame(age = rnorm(10, 30), body = rnorm(10, 30), 
                  weight = rnorm(10, 30), group = gl(2,5))

You can use lapply:

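lapply(dat[1:3], function(x) {
  g <- split(x, dat$group)  # one vector per group level; R >= 4.4.0 rejects `paired` with a formula
  t.test(g[[1]], g[[2]], paired = TRUE)  # rows are paired by position within each group
})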

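In the command above, 1:3 gives the positions of the columns that hold the variables to test. The argument paired = TRUE is required for a paired t-test. Observations are paired by their order within each group, so the rows of both groups must be arranged in the same subject order.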
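If only the key numbers are needed, the list of test results can be condensed into one table. The following is a minimal sketch rather than part of the original answer: it rebuilds the example data with a fixed seed, and the names `paired_test`, `vars`, `results`, and `summary_tab` are hypothetical, chosen here for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
dat <- data.frame(age = rnorm(10, 30), body = rnorm(10, 30),
                  weight = rnorm(10, 30), group = gl(2, 5))

# Hypothetical helper: paired t-test of one column between the two group levels.
paired_test <- function(x, group) {
  g <- split(x, group)                   # one vector per group level
  t.test(g[[1]], g[[2]], paired = TRUE)  # pairs are matched by position
}

vars <- setdiff(names(dat), "group")     # every column except the grouping variable
results <- lapply(dat[vars], paired_test, group = dat$group)

# One row per variable: t statistic, degrees of freedom, p-value.
summary_tab <- data.frame(
  variable  = vars,
  statistic = sapply(results, function(r) unname(r$statistic)),
  df        = sapply(results, function(r) unname(r$parameter)),
  p.value   = sapply(results, function(r) r$p.value),
  row.names = NULL
)
print(summary_tab)
```

Selecting columns by name with setdiff() avoids hard-coding 1:3, which may be convenient when the data frame has 35 variables to test.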




Answer 2:


Sven has provided you with a great way of implementing what you wanted. I do, however, want to warn you about the statistical side of what you are doing.

Recall that if you are using the standard significance level of 0.05, then for each t-test in which the null hypothesis is actually true, you have a 5% chance of committing a Type I error (incorrectly rejecting the null hypothesis). Running 35 individual t-tests does not multiply that probability by 35, but it does compound it. If all 35 null hypotheses are true and the tests are independent, the probability of at least one Type I error is:

Pr(at least one Type I error) = 1 - (0.95)^35 ≈ 0.834

This means you have about an 83.4% chance of falsely rejecting at least one null hypothesis. In other words, by running so many t-tests, there is a very high probability that at least one of them will give a false positive.

Just FYI.
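A common way to act on this warning is to adjust the p-values for multiple comparisons. The sketch below is an illustration, not part of the original answer: it reproduces the 0.834 figure and then applies p.adjust() from base R's stats package. The p-values in `p_raw` are simulated placeholders, not results from any real data, and the choice of the Holm and Benjamini-Hochberg methods is an assumption about what would suit the analysis.

```r
alpha <- 0.05
m <- 35                        # number of t-tests

# Chance of at least one false rejection if all m nulls are true
# and the tests are independent (both are assumptions).
fwer <- 1 - (1 - alpha)^m
print(round(fwer, 3))

# Simulated placeholder p-values (uniform under the null), not real results.
set.seed(1)
p_raw <- runif(m)

p_holm <- p.adjust(p_raw, method = "holm")  # controls the family-wise error rate
p_bh   <- p.adjust(p_raw, method = "BH")    # controls the false discovery rate

print(sum(p_raw < alpha))    # rejections without correction
print(sum(p_holm < alpha))   # rejections after Holm correction
```

In practice, `p_raw` would be replaced by the 35 p-values extracted from the t-test results, and the adjusted values would be compared against 0.05 instead of the raw ones.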



Source: https://stackoverflow.com/questions/20553995/t-test-with-grouping-variable
