Quickly remove zero variance variables from a data.frame

独厮守ぢ 2020-12-13 01:07

I have a large data.frame that was generated by a process outside my control, which may or may not contain variables with zero variance (i.e. all the observations are the same) ...

8 answers
  •  余生分开走
    2020-12-13 01:24

    How about using factor to count the number of unique elements in each column, looping over the columns with sapply:

    dat[sapply(dat, function(x) length(levels(factor(x)))>1)]
       B  D F
    1  3 10 I
    2  4 10 J
    3  6 10 I
    4  9 10 J
    5  2 10 I
    6  9 10 J
    7  9 10 I
    8  7 10 J
    9  6 10 I
    10 1  1 J
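
    To try this without the original data, here is a minimal sketch. The values of B, D and F are copied from the output above; the constant column C and the helper name has_variation are assumptions added purely for illustration.

    ```r
    # Hypothetical stand-in for `dat`: B, D and F follow the printed output,
    # C is an assumed constant (zero-variance) column.
    dat <- data.frame(
      B = c(3, 4, 6, 9, 2, 9, 9, 7, 6, 1),
      C = rep(10, 10),
      D = c(rep(10, 9), 1),
      F = factor(rep(c("I", "J"), 5))
    )

    # TRUE when a column has more than one distinct level.
    # (has_variation is an illustrative name, not part of any package.)
    has_variation <- function(x) length(levels(factor(x))) > 1

    # Logical vector with one entry per column, then keep only the TRUE columns.
    keep <- sapply(dat, has_variation)
    result <- dat[keep]
    names(result)
    ```

    Indexing with a single logical vector, dat[keep], selects columns and always returns a data.frame, even if only one column survives.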
    

    NAs are excluded by default, but this can be changed with the exclude parameter of factor:

    dat[sapply(dat, function(x) length(levels(factor(x,exclude=NULL)))>1)]
       B  D F  G
    1  3 10 I 10
    2  4 10 J 10
    3  6 10 I 10
    4  9 10 J 10
    5  2 10 I 10
    6  9 10 J 10
    7  9 10 I 10
    8  7 10 J 10
    9  6 10 I 10
    10 1  1 J NA
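
    The effect of exclude can be seen by counting levels per column both ways. This is only a sketch: D and G follow the output above, while the constant column C is an assumption added for contrast.

    ```r
    # Hypothetical data: D and G follow the printed output,
    # C is an assumed constant column.
    dat <- data.frame(
      C = rep(10, 10),
      D = c(rep(10, 9), 1),
      G = c(rep(10, 9), NA)
    )

    # Distinct levels per column with NA ignored (the default).
    n_default <- sapply(dat, function(x) length(levels(factor(x))))

    # Distinct levels per column with NA kept as a level of its own.
    n_with_na <- sapply(dat, function(x) length(levels(factor(x, exclude = NULL))))

    # Side-by-side comparison: only G changes between the two rows.
    rbind(n_default, n_with_na)
    ```

    Whether a column that is constant apart from missing values should be kept probably depends on how the downstream model treats NA.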
    
