na | 易学教程

How to remove row if it has a NA value in one certain column

Question: My data, called "dat":

    A  B  C
    NA 2  NA
    1  2  3
    1  NA 3
    1  2  3

I want every row that has an NA in column B to be removed:

    A  B  C
    NA 2  NA
    1  2  3
    1  2  3

na.omit(dat) removes all rows with an NA, not just the ones where the NA is in column B. I'd also like to know how to do this for NA values in two columns. I appreciate all advice!

Answer 1: The easiest solution is to use is.na():

    df[!is.na(df$B), ]

which gives you:

       A B  C
    1 NA 2 NA
    2  1 2  3
    4  1 2  3

Answer 2: There is an elegant solution if you use the tidyverse!
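The is.na() filter from the first answer can be sketched on the question's own data as follows. The two-column case is not answered in the excerpt, so the complete.cases() line is only one possible reading: it assumes a row should be dropped when either A or B is NA. The names one_col and two_cols are illustrative.

```r
# Rebuild the question's data frame.
dat <- data.frame(A = c(NA, 1, 1, 1),
                  B = c(2, 2, NA, 2),
                  C = c(NA, 3, 3, 3))

# Keep only rows where column B is not NA (NAs in A or C are left alone).
one_col <- dat[!is.na(dat$B), ]

# Assumption: for two columns, drop a row if A or B (or both) is NA.
two_cols <- dat[complete.cases(dat[, c("A", "B")]), ]

print(one_col)
print(two_cols)
```

Unlike na.omit(dat), both calls look only at the named columns, so the first row survives the one-column filter even though A and C are NA there.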

Keep values in data frame= Na (sodium in chemistry) as is

Question: Original df (clinical chemistry):

    Subject  Code  Test  Value  Units   Flag
    1        NA    NA    147    mmol/L
    2        NA/K  NA/K  10.5   RATIO
    3        K     K     4.7    mmol/L
    4        CK    CK    235    UL
    ...

Ideal df after cleaning:

    Subject  Code  Test              Value  Units   Flag
    1        NA    Sodium            147    mmol/L  NA
    2        NA/K  Sodium Potassium  10.5   RATIO   NA
    3        K     Potassium         4.7    mmol/L  NA
    4        CK    Creatine Kinase   235    UL      NA
    ...

What I have tried:

    df <- read.csv(file="clinchemistry.csv", header = TRUE, sep=",", stringsAsFactors = FALSE)
    df$df[df8$Test == "NA"] <- "Sodium"
    df$df[df8$Code ==
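One way to keep the literal string "NA" (sodium) from being read as a missing value is to change which strings read.csv() treats as missing. The sketch below is not taken from the thread: it assumes the file looks like the table in the question, inlines it as text instead of reading clinchemistry.csv, and uses a hypothetical lookup table to translate codes into test names.

```r
# Inline stand-in for clinchemistry.csv (assumed layout, empty Flag field).
csv_text <- "Subject,Code,Test,Value,Units,Flag
1,NA,NA,147,mmol/L,
2,NA/K,NA/K,10.5,RATIO,
3,K,K,4.7,mmol/L,
4,CK,CK,235,UL,"

# na.strings = "" makes only empty fields missing, so "NA" stays a string.
df <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

# Hypothetical lookup table mapping each code to its full test name.
lookup <- data.frame(
  code = c("NA", "NA/K", "K", "CK"),
  name = c("Sodium", "Sodium Potassium", "Potassium", "Creatine Kinase"),
  stringsAsFactors = FALSE
)

# match() compares strings, so the code "NA" is found like any other value.
df$Test <- lookup$name[match(df$Code, lookup$code)]
print(df)
```

With the default na.strings = "NA", the Code and Test entries for sodium would already be missing after import, and a comparison such as Test == "NA" would not find them.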

Create a line plot using categorical data and not connecting the lines

Question: I am trying to create a graph where both x and y are factors, but I don't want the lines to be connected if there is a gap. How can I achieve this?

    library(ggplot2)
    df <- data.frame(x = c('a', 'b', 'c', 'd', 'e'), y = c('a', 'a', NA, 'a', 'a'))
    ggplot(df, aes(x = x, y = y, group = y)) + geom_point() + geom_line()

I don't want the NA in the plot, and there shouldn't be a line between b and d.

Answer 1: This may need extra work with your full dataset but one approach is to create a grouping variable to use in
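The answer is cut off, so the following is only a guess at what a grouping variable could look like, not the answerer's actual code. It assumes a counter that increases at every NA, so that points on either side of a gap land in different groups; the column name grp and the object plot_df are made up for the sketch.

```r
df <- data.frame(x = c("a", "b", "c", "d", "e"),
                 y = c("a", "a", NA, "a", "a"))

# Make x a factor first so level "c" survives dropping its row.
df$x <- factor(df$x)

# The counter goes up by one at each NA, splitting the series at gaps.
df$grp <- cumsum(is.na(df$y))

# Remove the NA rows so no NA category appears on the y axis.
plot_df <- df[!is.na(df$y), ]

# Plot only if ggplot2 is installed; lines are drawn within each group.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_df, aes(x = x, y = y, group = grp)) +
    geom_point() +
    geom_line() +
    scale_x_discrete(drop = FALSE)  # keep the empty "c" position on the axis
}
```

Because b and d fall in different groups, geom_line() has no segment to draw between them.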

Replace NA values if last and next non-NA value are the same

Question: I am trying to fill missing data based on whether the last and next non-NA values are the same. For example, this is the dummy dataset:

    df <- data.frame(ID = c(rep(1, 6), rep(2, 6), rep(3, 6), rep(4, 6), rep(5, 6), rep(6, 6), rep(7, 6), rep(8, 6), rep(9, 6), rep(10, 6)),
                     with_missing = c("a", "a", NA, NA, "a", "a", "a", "a", NA, "b", "b", "b", "a", NA, NA, NA, "c", "c", "b", NA, "a", "a", "a", "a", "a", NA, NA, NA, NA, "a", "a", "a", NA, "b", "a", "a", "a", "a", NA, NA, "a", "a", "a", "a", NA,
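No answer is included in the excerpt, so here is a minimal base R sketch of the rule in the title: an NA is filled only when the last and the next non-NA value agree. The helper fill_if_same() is hypothetical, and applying it separately per ID with ave() is an assumption about what the asker wants. Only the first two ID groups of the dummy data are used, since the rest is truncated.

```r
# Hypothetical helper: fill an NA run only if its two neighbours agree.
fill_if_same <- function(x) {
  n <- length(x)
  pos <- seq_len(n)
  known <- !is.na(x)
  # Position of the most recent known value (0 if none yet).
  prev_pos <- cummax(ifelse(known, pos, 0L))
  # Position of the next known value (n + 1 if none follows).
  next_pos <- rev(cummin(rev(ifelse(known, pos, n + 1L))))
  # Only NAs with a known value on both sides are candidates.
  fillable <- !known & prev_pos >= 1L & next_pos <= n
  same <- fillable
  same[fillable] <- x[prev_pos[fillable]] == x[next_pos[fillable]]
  x[same] <- x[prev_pos[same]]
  x
}

# First two ID groups from the question's dummy data.
small <- data.frame(
  ID = c(rep(1, 6), rep(2, 6)),
  with_missing = c("a", "a", NA, NA, "a", "a", "a", "a", NA, "b", "b", "b"),
  stringsAsFactors = FALSE
)

# Assumption: gaps are filled within each ID, never across IDs.
small$filled <- ave(small$with_missing, small$ID, FUN = fill_if_same)
print(small)
```

In group 1 the gap sits between two "a" values and is filled; in group 2 it sits between "a" and "b" and stays NA.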

Calculating Sum Column and ignoring Na [duplicate]

Question: This question already has answers here: ignore NA in dplyr row sum (5 answers). Closed 2 years ago.

I am trying to create a Total column that adds up the values of the previous columns, but I am having difficulty when there is an NA. If a row contains an NA, my script does not calculate the sum. How do I edit the following script so that it treats the NAs as 0, or ignores them completely, but still calculates the sum? I don't want to actually change the NAs to 0.

    CTDB %>%
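The asker's script is cut off after CTDB %>%, so its columns are unknown. As a sketch under that limitation, the base R function rowSums() with na.rm = TRUE skips NAs when summing without modifying the stored values. The data frame scores and its columns q1 to q3 are invented for illustration.

```r
# Invented example data; the real CTDB columns are not shown in the excerpt.
scores <- data.frame(q1 = c(1, 4, NA),
                     q2 = c(2, NA, NA),
                     q3 = c(3, 0, NA))

# na.rm = TRUE ignores NAs in each row; the original columns keep their NAs.
scores$Total <- rowSums(scores[, c("q1", "q2", "q3")], na.rm = TRUE)
print(scores)
```

One design point to be aware of: a row in which every value is NA gets a total of 0 rather than NA, which may or may not be what is wanted.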

Backward replacement of NAs in time series only to a limited number of observations

Question: In a data table I want to perform a forward and backward gap-filling procedure over a period of 3 days in both directions.

    # Example data:
    library(data.table)
    library(zoo)
    dt <- data.table(Value = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.1359223, NA, NA, NA, NA, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0000000, NA))

    > dt
            Value
     1:        NA
     2:        NA
     3:        NA
     4:        NA
     5:        NA
     6:        NA
     7:        NA
     8:        NA
     9:        NA
    10: 0.1359223
    11:        NA
    12:        NA
    13:        NA
    14:        NA
    15: 0.0000000
    16: 0.0000000
    17: 0.0000000
    18: 0.0000000
    19: 0
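The excerpt ends before any answer, so the following is a base R sketch of one possible interpretation rather than the thread's solution. It assumes one row per day, carries each known value at most 3 rows forward, and then carries values at most 3 rows backward over what is still missing. The helper fill_limited() and its arguments are hypothetical.

```r
# Hypothetical helper: carry the last known value forward (or backward)
# over at most `limit` consecutive positions.
fill_limited <- function(x, limit = 3L, backward = FALSE) {
  if (backward) {
    return(rev(fill_limited(rev(x), limit)))
  }
  pos <- seq_along(x)
  # Position of the most recent non-NA value (0 if there is none yet).
  last_known <- cummax(ifelse(is.na(x), 0L, pos))
  can_fill <- is.na(x) & last_known > 0L & (pos - last_known) <= limit
  x[can_fill] <- x[last_known[can_fill]]
  x
}

# Values from the question, as a plain vector.
value <- c(rep(NA, 9), 0.1359223, rep(NA, 4), rep(0, 5), NA)

# Assumption: forward pass first, then backward pass on the remainder.
forward <- fill_limited(value, limit = 3L)
filled <- fill_limited(forward, limit = 3L, backward = TRUE)
print(filled)
```

With this order, rows 11 to 13 take the value from row 10, row 14 is filled backward from row 15, rows 7 to 9 are filled backward from row 10, and rows 1 to 6 stay NA because they are more than 3 rows from any observation. Running the backward pass first would give row 14 the same value but could change other results on different data.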

randomForest Error: NA not permitted in predictors (but no NAs in data)

Question: I am attempting to run the 'genie3' algorithm (ref: http://homepages.inf.ed.ac.uk/vhuynht/software.html) in R, which uses the 'randomForest' method. I am running into the following error:

    > weight.matrix<-get.weight.matrix(tmpLog2FC, input.idx=1:4551)
    Starting RF computations with 1000 trees/target gene, and 67 candidate input genes/tree node
    Computing gene 1/11805
    Error in randomForest.default(x, y, mtry = mtry, ntree = nb.trees, importance = TRUE, : NA not
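The excerpt does not show the cause of the error, so the following is only a generic diagnostic sketch, not a fix for genie3. It relies on one certain fact: is.na() returns TRUE for NaN as well as NA, so a matrix can look NA-free to the eye while still containing NaN. The matrix expr is an invented stand-in, not the asker's tmpLog2FC.

```r
# Invented stand-in matrix containing one NaN and one Inf.
expr <- matrix(c(1.2, NaN, 0.5, Inf, 2.0, 3.1), nrow = 3,
               dimnames = list(NULL, c("gene1", "gene2")))

# Count values that is.na() treats as missing (this includes NaN).
na_per_col <- colSums(is.na(expr))

# Count values that are not finite (NA, NaN, Inf, -Inf).
nonfinite_per_col <- colSums(!is.finite(expr))

print(na_per_col)
print(nonfinite_per_col)
```

Running the same two checks on the actual input would show whether any column holds NaN or infinite values before the matrix is handed to randomForest.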

Identify sets of NA in a vector

Question: Let's say I have a vector x:

    x <- c(NA, NA, 1, 2, NA, NA, 3, 4)

How do I identify sets of the NAs within this vector, i.e.,

    na_set <- c(1, 1, 0, 0, 2, 2, 0, 0)

My end goal is to use it with a pipe on a data frame using dplyr. So, if there's a function compatible with dplyr, that's even better. Thank you!

Answer 1: Compute the run length encoding of is.na(x) and replace the values with sequence numbers or 0. Then invert back.

    r <- rle(is.na(x))
    r$values <- cumsum(r$values) * r$values
    inverse.rle(r
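The run length encoding approach described in the answer can be written out as a small function. The wrapper name na_sets() is hypothetical; the three steps inside it follow the answer's description, using only base R.

```r
# Hypothetical wrapper around the rle()-based approach.
na_sets <- function(x) {
  # Runs of TRUE mark consecutive NAs, runs of FALSE mark known values.
  r <- rle(is.na(x))
  # cumsum() numbers the NA runs; multiplying by the flag zeroes the rest.
  r$values <- cumsum(r$values) * r$values
  # Expand the runs back to the original length.
  inverse.rle(r)
}

x <- c(NA, NA, 1, 2, NA, NA, 3, 4)
na_set <- na_sets(x)
print(na_set)
```

Since the function takes a vector and returns a vector of the same length, it should also work inside dplyr::mutate(), for example mutate(na_set = na_sets(x)), although that usage is not shown in the excerpt.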