Delete columns/rows with more than x% missing

孤街浪徒 2020-11-27 05:36

I want to delete all columns or rows with more than 50% NAs in a data frame.

This is my solution:

# delete columns with more than 50% missing

2 answers
  •  囚心锁ツ
    2020-11-27 06:06

    To remove columns by their share of NA values, use colMeans(is.na(...)), which gives the fraction of NA in each column; rowMeans(is.na(...)) does the same for rows.

    ## Some sample data
    set.seed(0)
    dat <- matrix(1:100, 10, 10)
    dat[sample(1:100, 50)] <- NA
    dat <- data.frame(dat)
    
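    ## Remove columns with more than 50% NA
    ## (a column with exactly 50% NA is kept; drop = FALSE keeps a data frame)
    dat[, which(colMeans(is.na(dat)) <= 0.5), drop = FALSE]
    
    ## Remove rows with more than 50% NA
    dat[which(rowMeans(is.na(dat)) <= 0.5), ]
    
    ## Remove columns and rows with more than 50% NA
    ## (both fractions are computed on the original data)
    dat[which(rowMeans(is.na(dat)) <= 0.5), which(colMeans(is.na(dat)) <= 0.5), drop = FALSE]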
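
    Since the question asks about an arbitrary x%, the same idea can be wrapped in a small function. This is only a sketch: the name drop_sparse and its arguments max_na, rows and cols are made up for illustration and are not part of base R or of the answer above. It assumes that "more than x% missing" means a strict inequality, so a row or column sitting exactly at the threshold is kept.

    ```r
    # Hypothetical helper (not from the original answer): drop rows and/or
    # columns whose fraction of NA values is strictly greater than `max_na`.
    drop_sparse <- function(df, max_na = 0.5, rows = TRUE, cols = TRUE) {
      # Both fractions are computed on the original data, before anything is removed
      keep_rows <- if (rows) rowMeans(is.na(df)) <= max_na else rep(TRUE, nrow(df))
      keep_cols <- if (cols) colMeans(is.na(df)) <= max_na else rep(TRUE, ncol(df))
      # drop = FALSE keeps the result a data frame even if only one column survives
      df[keep_rows, keep_cols, drop = FALSE]
    }

    ## Same sample data as above
    set.seed(0)
    dat <- matrix(1:100, 10, 10)
    dat[sample(1:100, 50)] <- NA
    dat <- data.frame(dat)

    cleaned   <- drop_sparse(dat, max_na = 0.5)                # rows and columns
    cols_only <- drop_sparse(dat, max_na = 0.3, rows = FALSE)  # columns only, 30% cutoff
    ```

    Note that removing rows changes the NA fraction of the remaining columns (and vice versa), so if the result has to satisfy the threshold afterwards, the function may need to be applied again.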
    
