R: Remove multiple empty columns of character variables

轮回少年 2020-12-08 11:06

I have a data frame where all the variables are of character type. Many of the columns are completely empty, i.e. only the variable headers are there, but no values. Is ther

9 Answers
  • 2020-12-08 11:42

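    This can also be done with dplyr's select_if, keeping only the columns that contain at least one non-NA value:

    `select_if(df,function(x){any(!is.na(x))})`
    

    If empty values are stored as "" rather than NA in your data, test for x != "" instead. Note that is.null() will not help here, since a data frame column is never NULL.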
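
    As a minimal sketch of combining both tests, the following keeps a column only if it has at least one value that is neither NA nor "". The data frame df and its columns are made up for illustration, and select(where(...)) assumes dplyr 1.0.0 or later; it is one possible spelling, not the only one.

    ```r
    library(dplyr)

    # Hypothetical sample data: y holds only "", z holds only NA
    df <- data.frame(
      x = c("a", "b", "c"),
      y = c("", "", ""),
      z = c(NA_character_, NA, NA),
      stringsAsFactors = FALSE
    )

    # Keep a column only if at least one value is neither NA nor ""
    kept <- df %>%
      select(where(~ any(!is.na(.x) & .x != "")))

    names(kept)
    ```

    Here only x survives, because y and z have no value that passes both checks.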

  • 2020-12-08 11:48

    If you're talking about columns where all values are NA, use remove_empty("cols") from the janitor package.

    If you have character vectors where every value is the empty string "", you can first convert those values to NA throughout your data.frame with na_if from the dplyr package:

    dat <- data.frame(
      x = c("a", "b", "c"),
      y = c("", "", ""),
      z = c(NA, NA, NA),
      stringsAsFactors = FALSE
    )
    
    dat
    #>   x y  z
    #> 1 a   NA
    #> 2 b   NA
    #> 3 c   NA
    
    library(dplyr)
    library(janitor)
    
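    dat %>%
      # funs() is deprecated, and recent dplyr versions make na_if() type-strict,
      # so convert "" to NA in the character columns only
      mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
      remove_empty("cols")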
    #>   x
    #> 1 a
    #> 2 b
    #> 3 c
    
  • 2020-12-08 11:54

    If your empty columns are really empty character columns, something like the following should work. It will need to be modified if your "empty" character columns include, say, spaces.

    Sample data:

    mydf <- data.frame(
      A = c("a", "b"),
      B = c("y", ""),
      C = c("", ""),
      D = c("", ""),
      E = c("", "z")
    )
    mydf
    #   A B C D E
    # 1 a y      
    # 2 b       z
    

    Identifying and removing the "empty" columns.

    mydf[!sapply(mydf, function(x) all(x == ""))]
    #   A B E
    # 1 a y  
    # 2 b   z
    

    Alternatively, as recommended by @Roland:

    mydf[, colSums(mydf != "") != 0]
    #   A B E
    # 1 a y  
    # 2 b   z
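
    If the "empty" columns may also contain spaces or NA, one possible modification is sketched below in base R. The helper name is_blank_col and the sample values are assumptions for illustration: a column is treated as empty when every value is NA, "", or whitespace only.

    ```r
    # Hypothetical helper: TRUE when every value is NA, "", or whitespace only
    is_blank_col <- function(x) all(is.na(x) | trimws(x) == "")

    # Sample data: C contains a whitespace-only value, D contains an NA
    mydf <- data.frame(
      A = c("a", "b"),
      B = c("y", ""),
      C = c("", "  "),
      D = c(NA, ""),
      E = c("", "z"),
      stringsAsFactors = FALSE
    )

    # drop = FALSE keeps a data frame even if only one column remains
    cleaned <- mydf[, !vapply(mydf, is_blank_col, logical(1)), drop = FALSE]
    names(cleaned)
    ```

    Using vapply with logical(1) rather than sapply guarantees a logical vector, and the is.na() check keeps all() from returning NA when a column mixes NA with empty strings.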
    