I am trying to use rowSds() to calculate each row's standard deviation so that I can pick the rows that have high SDs to graph.
My data frame is called
You can use the apply and transform functions:
set.seed(007)
X <- data.frame(matrix(sample(c(10:20, NA), 100, replace=TRUE), ncol=10))
transform(X, SD=apply(X,1, sd, na.rm = TRUE))
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 SD
1 NA 12 17 18 19 16 12 13 20 14 3.041381
2 14 12 13 13 14 18 16 17 20 10 3.020302
3 11 19 NA 12 19 19 19 20 12 20 3.865805
4 10 11 20 12 15 17 18 17 18 12 3.496029
5 12 15 NA 14 20 18 16 11 14 18 2.958040
6 19 11 10 20 13 14 17 16 10 16 3.596294
7 14 16 17 15 10 11 15 15 11 16 2.449490
8 NA 10 15 19 19 12 15 15 19 14 3.201562
9 11 NA NA 20 20 14 14 17 14 19 3.356763
10 15 13 14 15 NA 13 15 NA 15 12 1.195229
From ?apply you can see the ... argument, which passes optional arguments on to FUN. In this case you can use na.rm=TRUE to omit NA values.
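As a minimal sketch of how the ... forwarding behaves, the toy matrix below (made-up data, not from the question) shows the row SDs with and without na.rm=TRUE. The names m, sd_default and sd_narm are only illustrative.

```r
# Small matrix with one missing value per row (hypothetical data)
m <- matrix(c(1, 2, NA,
              4, NA, 8), nrow = 2, byrow = TRUE)

# Without na.rm, any NA in a row makes that row's sd NA
sd_default <- apply(m, 1, sd)

# na.rm = TRUE is not an argument of apply() itself;
# it is forwarded to sd() through ...
sd_narm <- apply(m, 1, sd, na.rm = TRUE)

print(sd_default)
print(sd_narm)
```

Any other argument of FUN can be passed the same way, as long as its name does not clash with apply's own arguments (X, MARGIN, FUN).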
Using rowSds from the matrixStats package also requires setting na.rm=TRUE to omit NA values:
library(matrixStats)
transform(X, SD=rowSds(as.matrix(X), na.rm=TRUE)) # rowSds() expects a matrix; same result as before.
A dplyr approach also works, based on this answer:
library(dplyr)
set.seed(007)
X <- data.frame(matrix(sample(c(10:20, NA), 100, replace=TRUE), ncol=10))
# names of the columns that go into the row-wise SD
vars_to_sum = grep("X", names(X), value=TRUE)
X %>%
  group_by(row_number()) %>%
  do(data.frame(.,
                SD = sd(unlist(.[vars_to_sum]), na.rm=TRUE)))
...which appends a couple of row-number columns, so it is probably better to add your own row IDs explicitly for grouping:
X %>%
  mutate(ID = row_number()) %>%
  group_by(ID) %>%
  do(data.frame(., SD = sd(unlist(.[vars_to_sum]), na.rm=TRUE)))
This syntax also lets you specify which columns go into the calculation, through vars_to_sum.
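To get from a per-row SD to the rows worth graphing, which is what the question asks for, something like the following base R sketch may help. The cutoffs are assumptions rather than anything stated in the thread: n_top = 3 and the median threshold are arbitrary illustrative choices, and row_sd, top_rows and high_rows are hypothetical names.

```r
set.seed(7)
X <- data.frame(matrix(sample(c(10:20, NA), 100, replace = TRUE), ncol = 10))

# Row-wise SD, computed before any SD column is added to X
row_sd <- apply(X, 1, sd, na.rm = TRUE)

# Option 1: keep the n_top rows with the largest SD (n_top is an assumed choice)
n_top <- 3
top_idx <- order(row_sd, decreasing = TRUE)[seq_len(n_top)]
top_rows <- X[top_idx, ]

# Option 2: keep rows whose SD exceeds a cutoff (the median here, also assumed)
high_rows <- X[row_sd > median(row_sd), ]

print(top_rows)
```

Keeping row_sd as a separate vector, rather than a column of X, avoids accidentally including the SD itself if the row-wise calculation is rerun on the same data frame.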