Using shapiro.test on multiple columns in a data frame


伪装坚强ぢ 2020-12-28 10:08

It seems like a pretty simple question, but I can't find the answer.

I have a data frame (let's call it df), containing n=100 columns (C1, <

3 answers

悲哀的现实 (original poster)

2020-12-28 10:42
I don't think this is a sensible approach to data analysis, but the underlying task, applying a function to every column of a data frame, is a general one. It is easily done with sapply() or lapply(). apply() also works, but it coerces the data frame to a matrix first, so one of the other two is the better choice for data frames.

Here is an example, using some dummy data:
```
set.seed(42)
df <- data.frame(Gaussian = rnorm(50), Poisson = rpois(50, 2), 
                 Uniform = runif(50))
```
Now apply the shapiro.test() function to each column. shapiro.test() returns a list-like object (of class "htest"), so we collect the results in a list with lapply().
```
lshap <- lapply(df, shapiro.test)
lshap[[1]] ## look at the first column results

R> lshap[[1]]

    Shapiro-Wilk normality test

data:  X[[1L]]
W = 0.9802, p-value = 0.5611
```
You will need to extract the things you want from these objects, which all have the structure:
```
R> str(lshap[[1]])
List of 4
 $ statistic: Named num 0.98
  ..- attr(*, "names")= chr "W"
 $ p.value  : num 0.561
 $ method   : chr "Shapiro-Wilk normality test"
 $ data.name: chr "X[[1L]]"
 - attr(*, "class")= chr "htest"
```
To get the statistic and p.value components for every element of lshap, we use sapply() this time, which arranges the results into a matrix for us:
```
lres <- sapply(lshap, `[`, c("statistic","p.value"))

R> lres
          Gaussian Poisson Uniform 
statistic 0.9802   0.9371  0.918   
p.value   0.5611   0.01034 0.001998
```
Given that you have 100 of these, I'd transpose lres so that each column of df becomes a row:
```
R> t(lres)
         statistic p.value 
Gaussian 0.9802    0.5611  
Poisson  0.9371    0.01034 
Uniform  0.918     0.001998
```
If you plan on doing anything with the p-values from this exercise, I suggest you start thinking about how to correct for multiple comparisons before you shoot yourself in the foot with a 30-cal.
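As a starting point for that correction, base R's p.adjust() can adjust a whole vector of p-values at once. This is only a sketch: the choice of the Holm and Benjamini-Hochberg methods is my assumption, not a recommendation for your data, and the names p_raw, p_holm and p_bh are made up for the example:

```r
set.seed(42)
df <- data.frame(Gaussian = rnorm(50), Poisson = rpois(50, 2),
                 Uniform = runif(50))

# One raw p-value per column.
p_raw <- vapply(df, function(x) shapiro.test(x)$p.value, numeric(1))

# Holm controls the family-wise error rate;
# "BH" (Benjamini-Hochberg) controls the false discovery rate.
p_holm <- p.adjust(p_raw, method = "holm")
p_bh   <- p.adjust(p_raw, method = "BH")

# Raw and adjusted p-values side by side, one row per column of df.
data.frame(p_raw, p_holm, p_bh)
```

Adjusted p-values are never smaller than the raw ones, so with 100 columns fewer of them will fall below whatever threshold you use.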