Exclude Blank and NA in R [duplicate]

夙愿已清 提交于 2019-11-28 19:20:04

问题


Possible Duplicate:
R - remove rows with NAs in data.frame

I have a dataframe named sub.new with multiple columns in it. And I'm trying to exclude any cell containing NA or a blank space "".
I tried to use subset(), but it's targeting specific column conditional. Is there anyway to scan through the whole dataframe and create a subset that no cell is either NA or blank space ?

In the example below, only the first line should be kept:

# ID               SNP             ILMN_Strand   Customer_Strand
ID1234              [A/G]          TOP           BOT
Non-Specific        NSB (Bgnd)     Green
Non-Polymorphic     NP (A)         Red
Non-Polymorphic     NP (T)         Purple
Non-Polymorphic     NP (C)         Green
Non-Polymorphic     NP (G)         Blue
Restoration         Restore        Green

Any suggestions? Thanks


回答1:


A good idea is to set all of the "" (blank cells) to NA before any further analysis.

If you are reading your input from a file, it is a good choice to cast all "" to NAs:

foo <- read.table(file="Your_file.txt", na.strings=c("", "NA"), sep="\t") # if your file is tab delimited

If you have already your table loaded, you can act as follows:

foo[foo==""] <- NA

Then to keep only rows with no NA you may just use na.omit():

foo <- na.omit(foo)

Or to keep columns with no NA:

foo <- foo[, colSums(is.na(foo)) == 0] 



回答2:


Don't know exactly what kind of dataset you have, so I provide general answer.

x <- c(1,2,NA,3,4,5)
y <- c(1,2,3,NA,6,8)
my.data <- data.frame(x, y)
> my.data
   x  y
1  1  1
2  2  2
3 NA  3
4  3 NA
5  4  6
6  5  8
# Exclude rows with NA values
my.data[complete.cases(my.data),]
  x y
1 1 1
2 2 2
5 4 6
6 5 8


来源:https://stackoverflow.com/questions/12763890/exclude-blank-and-na-in-r

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!