Calculating standard deviation of each row

心在旅途 2020-12-06 06:28

I am trying to use rowSds() to calculate each row's standard deviation so that I can pick the rows with high SDs to graph.

My data frame is called

2 Answers
  • 2020-12-06 07:10

    You can use the apply and transform functions:

    set.seed(007)
    X <- data.frame(matrix(sample(c(10:20, NA), 100, replace=TRUE), ncol=10))
    transform(X, SD=apply(X,1, sd, na.rm = TRUE))
       X1 X2 X3 X4 X5 X6 X7 X8 X9 X10       SD
    1  NA 12 17 18 19 16 12 13 20  14 3.041381
    2  14 12 13 13 14 18 16 17 20  10 3.020302
    3  11 19 NA 12 19 19 19 20 12  20 3.865805
    4  10 11 20 12 15 17 18 17 18  12 3.496029
    5  12 15 NA 14 20 18 16 11 14  18 2.958040
    6  19 11 10 20 13 14 17 16 10  16 3.596294
    7  14 16 17 15 10 11 15 15 11  16 2.449490
    8  NA 10 15 19 19 12 15 15 19  14 3.201562
    9  11 NA NA 20 20 14 14 17 14  19 3.356763
    10 15 13 14 15 NA 13 15 NA 15  12 1.195229
    

    From ?apply you can see the ... argument, which passes optional arguments on to FUN. In this case you can pass na.rm=TRUE so that sd omits NA values.
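
    As a small illustration of that pass-through (a sketch with a made-up matrix m, not the data above), the extra na.rm argument given to apply is handed on to sd for every row:

    ```r
    # Hypothetical 2x3 matrix; the second row contains a missing value.
    m <- matrix(c(1, 2, 3,
                  2, NA, 4), nrow = 2, byrow = TRUE)

    # Without na.rm, a row that contains NA yields NA.
    sd_default <- apply(m, 1, sd)

    # na.rm = TRUE travels through apply's ... argument to sd().
    sd_no_na <- apply(m, 1, sd, na.rm = TRUE)

    print(sd_default)
    print(sd_no_na)
    ```

    The first row gives the same value either way, and only the row with the NA changes.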

    rowSds from the matrixStats package also needs na.rm=TRUE to omit NA values. It is documented to take a matrix, so convert the data frame with as.matrix first:

    library(matrixStats)
    transform(X, SD=rowSds(as.matrix(X), na.rm=TRUE)) # same result as before.
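
    To get from the SD column to the rows worth plotting, which is what the question is after, one possible follow-up is to filter on it. This is only a sketch: the small data frame df and the cutoff of 2 are made up for illustration.

    ```r
    # Hypothetical data: three rows with clearly different spread.
    df <- data.frame(a = c(1, 10, 5), b = c(2, 20, 5), c = c(3, 30, NA))

    # Row-wise standard deviation, ignoring NA values.
    df$SD <- apply(df[, c("a", "b", "c")], 1, sd, na.rm = TRUE)

    # Keep rows whose SD exceeds an arbitrary cutoff (an assumption).
    cutoff <- 2
    high_sd <- df[df$SD > cutoff, ]

    # Alternatively, take the two rows with the largest SD.
    top2 <- df[order(df$SD, decreasing = TRUE)[1:2], ]
    ```

    A fixed cutoff suits data with a known scale, while the order-based variant always returns the same number of rows.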
    
  • 2020-12-06 07:11

    This also works, based on this answer:

    library(dplyr)
    
    set.seed(007)
    X <- data.frame(matrix(sample(c(10:20, NA), 100, replace=TRUE), ncol=10))
    
    # Names of the columns that go into the standard deviation
    vars_to_sum <- grep("X", names(X), value=TRUE)
    X %>% 
      group_by(row_number()) %>%
      do(data.frame(., 
                    SD = sd(unlist(.[vars_to_sum]), na.rm=TRUE)))
    

    ...which appends a couple of row-number columns, so it is probably better to add your row IDs explicitly and group by them.

    X %>% 
      mutate(ID = row_number()) %>%
      group_by(ID) %>%
      do(data.frame(., SD = sd(unlist(.[vars_to_sum]), na.rm=TRUE)))
    

    This syntax also lets you specify which columns to use, through vars_to_sum.
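
    For instance, restricting the calculation to certain columns only requires changing the vector of names. The sketch below shows the same idea with base R's apply rather than the dplyr pipeline; the data frame df and the name vars_to_use are made up for illustration.

    ```r
    # Hypothetical data frame with an ID column that must not enter the SD.
    df <- data.frame(ID = 1:3,
                     X1 = c(1, 4, 2),
                     X2 = c(3, 4, NA),
                     X3 = c(5, 4, 6))

    # Pick the columns by name pattern, much like vars_to_sum in the answer.
    vars_to_use <- grep("^X", names(df), value = TRUE)

    # Row-wise SD over the selected columns only, ignoring NA values.
    df$SD <- apply(df[vars_to_use], 1, sd, na.rm = TRUE)
    ```

    Anchoring the pattern with ^ keeps columns that merely contain an X elsewhere in their name out of the selection.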
