How to simply count number of rows with NAs - R [duplicate]

末鹿安然 submitted on 2019-12-11 01:07:43

Question


I'm trying to count the rows of a data frame that contain at least one NA, looking across all columns, so that I can compute the percentage of such rows out of the total number of rows.

I have already seen this post: Determine the number of rows with NAs, but it only covers a specific range of columns.


Answer 1:


tl;dr: row-wise, you'll want sum(!complete.cases(DF)) or, equivalently, sum(apply(DF, 1, anyNA))

There are a number of different ways to look at the number, proportion, or position of NA values in a data frame.

Most of these start from the logical matrix returned by is.na(), which is TRUE for every NA and FALSE everywhere else. For the base dataset airquality:

is.na(airquality)

There are 44 NA values in this data set:

sum(is.na(airquality))
# [1] 44

You can look at the total number of NA values per row or column:

head(rowSums(is.na(airquality)))
# [1] 0 0 0 0 2 1
colSums(is.na(airquality))
#   Ozone Solar.R    Wind    Temp   Month     Day 
#      37       7       0       0       0       0 
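The per-row counts lead straight to the number of rows with NAs: a row has at least one NA exactly when its count is greater than zero. A minimal sketch (the variable names na_per_row and rows_with_na are illustrative, not from the original answer):

```r
# Number of NA values in each row of the built-in airquality data set
na_per_row <- rowSums(is.na(airquality))

# TRUE for every row that contains at least one NA
rows_with_na <- na_per_row > 0

# Count of rows with at least one NA
sum(rows_with_na)
# [1] 42
```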

You can also use anyNA(), a shortcut for any(is.na(x)), applied by row or by column:

# by row
head(apply(airquality, 1, anyNA))
# [1] FALSE FALSE FALSE FALSE  TRUE  TRUE
sum(apply(airquality, 1, anyNA))
# [1] 42


# by column
head(apply(airquality, 2, anyNA))
#   Ozone Solar.R    Wind    Temp   Month     Day 
#    TRUE    TRUE   FALSE   FALSE   FALSE   FALSE
sum(apply(airquality, 2, anyNA))
# [1] 2
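Note that apply() converts the data frame to a matrix before looping. For the column-wise check, one possible alternative is to iterate over the columns directly with vapply(), which skips that conversion. This is only a sketch, and col_has_na is an illustrative name:

```r
# A data frame is a list of columns, so vapply() visits each column in turn
col_has_na <- vapply(airquality, anyNA, logical(1))
col_has_na

# Number of columns containing at least one NA
sum(col_has_na)
# [1] 2
```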

complete.cases() can be used, but only row-wise:

sum(!complete.cases(airquality))
# [1] 42
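Since the question asks for the percentage of rows with NAs, the count can be divided by nrow(), or the mean of the logical vector can be taken directly. A short sketch (the names n_incomplete and pct_incomplete are illustrative):

```r
# Rows with at least one NA
n_incomplete <- sum(!complete.cases(airquality))

# Percentage of such rows over all rows
pct_incomplete <- n_incomplete / nrow(airquality) * 100
pct_incomplete
# [1] 27.45098

# Same result in one step: the mean of a logical vector is the share of TRUEs
mean(!complete.cases(airquality)) * 100
```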



Answer 2:


From the example here:

DF <- read.table(text="     col1   col2    col3
 1    23    17      NA
 2    55    NA      NA
 3    24    12      13
 4    34    23      12", header=TRUE)

You can check which rows have at least one NA:

(which_nas <- apply(DF, 1, function(X) any(is.na(X))))
#    1     2     3     4 
# TRUE  TRUE FALSE FALSE 

And then count them, identify them or get the ratio:

## Identify them
which(which_nas)
# 1 2 
# 1 2 

## Count them
length(which(which_nas))
#[1] 2

## Ratio
length(which(which_nas))/nrow(DF)
#[1] 0.5
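The three steps above could be wrapped in a small helper. This is a sketch under assumptions: na_row_summary is a hypothetical name, and it uses complete.cases() with sum() and mean() instead of apply() and length(which()):

```r
# Hypothetical helper: which rows have NAs, how many, and what share
na_row_summary <- function(df) {
  has_na <- !complete.cases(df)   # TRUE for rows with at least one NA
  list(
    rows  = which(has_na),        # positions of the affected rows
    count = sum(has_na),          # number of affected rows
    ratio = mean(has_na)          # affected rows / total rows
  )
}

# Same example data as above
DF <- read.table(text = "col1 col2 col3
23 17 NA
55 NA NA
24 12 13
34 23 12", header = TRUE)

res <- na_row_summary(DF)
res$count
# [1] 2
res$ratio
# [1] 0.5
```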


Source: https://stackoverflow.com/questions/50919161/how-to-simply-count-number-of-rows-with-nas-r
