Question
I'm trying to count the rows that contain at least one NA across the whole data frame, so that I can compute the percentage of such rows over the total number of rows in the df.
I have already seen this post: Determine the number of rows with NAs, but it only covers a specific range of columns.
Answer 1:
tl;dr: row-wise, you'll want sum(!complete.cases(DF)) or, equivalently, sum(apply(DF, 1, anyNA))
There are a number of different ways to look at the number, proportion or position of NA values in a data frame.
Most of these start from the logical matrix that is TRUE for every NA and FALSE everywhere else. For the base dataset airquality:
is.na(airquality)
There are 44 NA values in this data set:
sum(is.na(airquality))
# [1] 44
You can look at the total number of NA values per row or column:
head(rowSums(is.na(airquality)))
# [1] 0 0 0 0 2 1
colSums(is.na(airquality))
#   Ozone Solar.R    Wind    Temp   Month     Day
#      37       7       0       0       0       0
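If proportions are more useful than counts, the same logical matrix can be averaged instead of summed, since the mean of TRUE/FALSE values is the share of TRUE. A small sketch on airquality (na_share_by_col is an illustrative name, not from the original answer):

```r
# Logical matrix: TRUE wherever airquality has an NA
na_mat <- is.na(airquality)

# Mean of each column = proportion of NA values in that column
na_share_by_col <- colMeans(na_mat)
na_share_by_col

# Overall proportion of NA cells in the whole data frame
mean(na_mat)
```

rowMeans(na_mat) would give the analogous per-row share.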
You can use anyNA() in place of is.na() as well:
# by row
head(apply(airquality, 1, anyNA))
# [1] FALSE FALSE FALSE FALSE TRUE TRUE
sum(apply(airquality, 1, anyNA))
# [1] 42
# by column
head(apply(airquality, 2, anyNA))
# Ozone Solar.R Wind Temp Month Day
# TRUE TRUE FALSE FALSE FALSE FALSE
sum(apply(airquality, 2, anyNA))
# [1] 2
complete.cases() can be used, but only row-wise:
sum(!complete.cases(airquality))
# [1] 42
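Since the question asks for a percentage, the count above converts directly: the mean of a logical vector is the proportion of TRUE values. A minimal sketch on airquality (the names has_na, n_na_rows and pct_na_rows are only illustrative):

```r
# Logical vector: TRUE for each row containing at least one NA
has_na <- !complete.cases(airquality)

n_na_rows   <- sum(has_na)         # number of rows with any NA
pct_na_rows <- 100 * mean(has_na)  # share of such rows, in percent

n_na_rows
# [1] 42
round(pct_na_rows, 2)
# [1] 27.45
```

This is the same as 100 * n_na_rows / nrow(airquality), with 42 of the 153 rows affected.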
Answer 2:
From the example here:
DF <- read.table(text=" col1 col2 col3
1 23 17 NA
2 55 NA NA
3 24 12 13
4 34 23 12", header=TRUE)
You can check which rows have at least one NA:
(which_nas <- apply(DF, 1, function(X) any(is.na(X))))
# 1 2 3 4
# TRUE TRUE FALSE FALSE
And then count them, identify them or get the ratio:
## Identify them
which(which_nas)
# 1 2
# 1 2
## Count them
length(which(which_nas))
#[1] 2
## Ratio
length(which(which_nas))/nrow(DF)
#[1] 0.5
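If this is needed for several data frames, the three steps could be wrapped in a small function. The sketch below is only one possible way to do it: na_row_summary is a hypothetical helper name, and it swaps in complete.cases() for the apply() call used above. DF is rebuilt with data.frame() so the snippet runs on its own.

```r
# Hypothetical helper (not part of the original answer):
# returns the count, total and ratio of rows with at least one NA
na_row_summary <- function(df) {
  has_na <- !complete.cases(df)   # TRUE for rows with any NA
  c(count = sum(has_na),
    total = nrow(df),
    ratio = mean(has_na))
}

# Same values as the example DF above
DF <- data.frame(col1 = c(23, 55, 24, 34),
                 col2 = c(17, NA, 12, 23),
                 col3 = c(NA, NA, 13, 12))

res <- na_row_summary(DF)
res
```

For this DF the result holds a count of 2, a total of 4 and a ratio of 0.5, matching the values computed step by step above.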
Source: https://stackoverflow.com/questions/50919161/how-to-simply-count-number-of-rows-with-nas-r