Suppose a data set has several rows and columns, and some of the columns are 0 (I mean all values in the column are 0's). How can one filter out those columns? I have tried w
I think that in the solutions using all(x == 0) it is slightly more efficient to use any(x != 0), because any stops after the first instance of an element being != 0, which will be important with a growing number of rows.
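As a quick sanity check that the two phrasings select the same columns, here is a minimal base R sketch. The data frame toy and the names keep_all and keep_any are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data: column a is all zeros, b and c are not.
toy <- data.frame(a = c(0, 0, 0), b = c(1, 0, 2), c = c(0, 0, 5))

# Columns to keep, phrased as "not all zero" ...
keep_all <- !sapply(toy, function(x) all(x == 0))
# ... and as "at least one non-zero entry".
keep_any <- sapply(toy, function(x) any(x != 0))

# Both give the same named logical vector.
identical(keep_all, keep_any)

# drop = FALSE keeps the result a data.frame even if one column survives.
toy[, keep_any, drop = FALSE]
```

Either vector can be used for the subsetting step; the sketch assumes the columns contain no NA values.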
To provide a different solution using plyr and colwise (dat being the dput data):
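library(plyr)
# keep a column only if it is numeric and has at least one non-zero entry;
# is.numeric() is checked first so a non-numeric column is never compared to 0
f0 <- function(x) is.numeric(x) && any(x != 0)
# return every column that passes f0 unchanged and drop the rest
colwise(identity, f0)(dat)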
The idea is to go through every column in dat and return it (identity), but only if f0 returns TRUE, i.e. the column has at least one entry != 0 and the column is.numeric.
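If you would rather not load plyr, the same "keep the column only if the predicate is TRUE" idea can be sketched in base R with Filter(). This is a swapped-in alternative, not the colwise call itself, and toy and keep_col are hypothetical names used only for this example.

```r
# Hypothetical toy data: one non-numeric column, one all-zero column,
# and one numeric column with a non-zero entry.
toy <- data.frame(label = c("x", "y", "z"),
                  zeros = c(0, 0, 0),
                  vals  = c(0, 3, 0),
                  stringsAsFactors = FALSE)

# Same idea as f0: numeric and at least one non-zero entry.
keep_col <- function(x) is.numeric(x) && any(x != 0)

# Filter() keeps the columns for which keep_col returns TRUE;
# on a data.frame the result is again a data.frame.
kept <- Filter(keep_col, toy)
names(kept)
```

Note that the label column is dropped here as well, because it is not numeric.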
EDIT:
To do this for every data.frame in your list, e.g. training_data <- list(dat, dat, dat, dat):
training_data_clean <- lapply(training_data, function(z) colwise(identity, f0)(z))
sapply(training_data, dim)
     [,1] [,2] [,3] [,4]
[1,]    6    6    6    6
[2,]  111  111  111  111
sapply(training_data_clean, dim)
     [,1] [,2] [,3] [,4]
[1,]    6    6    6    6
[2,]   74   74   74   74
EDIT2: To retain the label column:
lapply(training_data, function(z) cbind(label = z$label, colwise(identity, f0)(z)))
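To see the list version and the label handling end to end without the original data, here is a small self-contained sketch. It uses base R's Filter() in place of colwise so that it runs without plyr; toy, toy_list, keep_col and cleaned are hypothetical names, and the data is made up.

```r
# Hypothetical toy data standing in for one element of training_data.
toy <- data.frame(label = c("x", "y"),
                  zeros = c(0, 0),
                  vals  = c(1, 0),
                  stringsAsFactors = FALSE)
toy_list <- list(toy, toy)

# Numeric and at least one non-zero entry, as in f0.
keep_col <- function(x) is.numeric(x) && any(x != 0)

# For each data.frame: drop the unwanted columns, then put label back in front.
cleaned <- lapply(toy_list, function(z) cbind(label = z$label, Filter(keep_col, z)))

# Rows and columns of each cleaned data.frame.
sapply(cleaned, dim)
```

The cbind step is needed because the label column is not numeric and would otherwise be filtered out along with the all-zero columns.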