Find columns with all missing values

Happy的楠姐 2020-12-09 08:42

I am writing a function, which needs a check on whether (and which!) column (variable) has all missing values (NA, ). The following is fr

8 Answers
  • 2020-12-09 08:54

    This is easy enough to do with sapply and a small anonymous function:

    sapply(test1, function(x)all(is.na(x)))
       X1    X2    X3 
    FALSE FALSE FALSE 
    
    sapply(test2, function(x)all(is.na(x)))
       X1    X2    X3 
    FALSE  TRUE FALSE 
    

    And inside a function:

    na.test <-  function (x) {
      w <- sapply(x, function(x)all(is.na(x)))
      if (any(w)) {
        stop(paste("All NA in columns", paste(which(w), collapse=", ")))
      }
    }
    
    na.test(test1)
    
    na.test(test2)
    Error in na.test(test2) : All NA in columns 2
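
    The question's sample data is cut off above, so the following is only a sketch under an assumption: test1 and test2 here are hypothetical data frames, made up so that the results match the output shown in this answer. The helper name all_na is also an invention for illustration. It uses vapply instead of sapply so the result is always a logical vector, and it reports column names rather than indices.

```r
# Hypothetical sample data (the question's own data is truncated)
test1 <- data.frame(X1 = c(1, NA, 3), X2 = c(NA, 5, 6), X3 = c(7, 8, 9))
test2 <- data.frame(X1 = c(1, NA, 3), X2 = c(NA, NA, NA), X3 = c(7, 8, 9))

# Hypothetical helper: named logical vector, TRUE where every value in the column is NA
all_na <- function(df) vapply(df, function(col) all(is.na(col)), logical(1))

all_na(test1)
all_na(test2)

# Names (rather than positions) of the all-NA columns
names(which(all_na(test2)))
```

    One edge case to keep in mind: for a data frame with zero rows, all() of an empty vector is TRUE, so every column would be reported as all-NA.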
    
  • 2020-12-09 08:54

    To test whether columns have all missing values:

    apply(test1,2,function(x) {all(is.na(x))})
    

    To get the names of the columns that have all missing values:

    names(test1)[colSums(is.na(test1)) == nrow(test1)]
    
  • 2020-12-09 08:55

    This one returns the names of the columns that are full of NAs:

    library(purrr)
    df %>% keep(~all(is.na(.x))) %>% names
    
  • 2020-12-09 08:58

    With dplyr:

    library(dplyr)

    ColNums_NotAllMissing <- function(df){ # helper: indices of columns that are not entirely NA
      as.vector(which(colSums(is.na(df)) != nrow(df)))
    }
    
    df %>%
      select(ColNums_NotAllMissing(.))
    
    # Example: column x is entirely NA, so only y and z are kept
    x <- data.frame(x = c(NA, NA, NA), y = c(1, 2, NA), z = c(5, 6, 7))
    
    x %>%
      select(ColNums_NotAllMissing(.))
    

    Or, the other way around:

    Cols_AllMissing <- function(df){ # helper function
      as.vector(which(colSums(is.na(df)) == nrow(df)))
    }
    
    
    x %>%
      select(-Cols_AllMissing(.))
    
  • 2020-12-09 09:00

    To find the columns with all values missing:

    allmisscols <- apply(dataset, 2, function(x) all(is.na(x)))
    colswithallmiss <- names(allmisscols[allmisscols > 0])
    print("the columns with all values missing")
    print(colswithallmiss)
    
  • 2020-12-09 09:00

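    The following command returns a named logical vector flagging the columns that contain at least one NA value. Note that this is a looser test than "all values missing"; the original all(any(is.na(x))) is equivalent to the simpler any(is.na(x)):

    sapply(dataframe, function(x) any(is.na(x)))
    

    It's offered as an improvement on the first answer you got, which doesn't work properly in some cases.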
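
    To make the difference between the two tests concrete, here is a small sketch on made-up data (the data frame df and its column names are hypothetical, not from the question). Column a has some NAs, column b has only NAs, and column c has none.

```r
# Hypothetical data for illustration only
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, NA), c = c(4, 5, 6))

# TRUE where the column contains at least one NA
any_na <- sapply(df, function(x) any(is.na(x)))

# TRUE only where every value in the column is NA
all_na <- sapply(df, function(x) all(is.na(x)))

any_na
all_na
```

    Here any_na flags both a and b, while all_na flags only b, which is the column the question is asking about.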
