na | 易学教程

Behavior of summing !is.na() results

Question: Why does the first line return TRUE, while the third line returns 1? I would expect both lines to return 1. What exactly do the two extra pairs of parentheses in the third line do?

    !is.na(5) + !is.na(NA)      # TRUE

    (!is.na(5)) + (!is.na(NA))  # 1

Edit: I should check these multiple times. The original problem was with !is.na(); I thought it replicated for is.na(), but it didn't :)

Answer 1: ! has a weird, counter-intuitive precedence in R. Your first code is equivalent to !(is.na(5) + !is.na(NA)) That
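The precedence point made in the answer can be checked directly. The following is a minimal sketch; the variable names are illustrative only and do not come from the original post.

```r
# Unary ! binds more loosely than +, so it negates the whole sum that follows it.
unparenthesised <- !is.na(5) + !is.na(NA)       # parsed as !(is.na(5) + !is.na(NA))
parenthesised   <- (!is.na(5)) + (!is.na(NA))   # TRUE + FALSE

# Counting non-missing values is usually clearer with sum() on a logical vector.
non_missing <- sum(!is.na(c(5, NA)))

print(unparenthesised)
print(parenthesised)
print(non_missing)
```

Passing a single logical vector to sum() sidesteps the precedence issue, because no arithmetic operator appears next to the `!`.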

Using ifelse() to replace NAs in one data frame by referencing another data frame of different length

Question: I have already reviewed the following two posts and think they might answer my question, although I'm struggling to see how:

1) Conditional replacement of values in a data.frame
2) Creating a function to replace NAs from one data.frame with values from another

With that said, I'm trying to replace NAs in one data frame by referencing another data frame of a different (shorter) length and pulling in replacement values from column "B" where the values for column "A" in each data frame match. I've
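One common way to approach the lookup described in this question is match() combined with ifelse(). This is only a sketch: the data frames `df1` and `df2` and their contents are hypothetical, since the original data is not shown in the excerpt.

```r
# Hypothetical data: df1 has gaps in B, df2 is a shorter lookup table.
df1 <- data.frame(A = c("a", "b", "c", "d"), B = c(1, NA, 3, NA))
df2 <- data.frame(A = c("b", "d"), B = c(20, 40))

# For every row of df1, find the position of its key in df2 (NA when absent).
idx <- match(df1$A, df2$A)

# Fill only the missing entries; existing values in df1$B are kept as they are.
df1$B <- ifelse(is.na(df1$B), df2$B[idx], df1$B)

print(df1)
```

Because match() returns one index per row of `df1`, the replacement vector always has the right length, even though `df2` is shorter.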

Remove NA/NaN/Inf in a matrix

Question: I want to try two things:

1) How do I remove rows that contain NA/NaN/Inf?
2) How do I set the value of a data point from NA/NaN/Inf to 0?

So far, I have tried the following for NA values, but I have been getting warnings:

    > eg <- data[rowSums(is.na(data)) == 0,]
    Error in rowSums(is.na(data)) :
      'x' must be an array of at least two dimensions
    In addition: Warning message:
    In is.na(data) : is.na() applied to non-(list or vector) of type 'closure'

Answer 1: I guess I'll throw my hat into the ring with my
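The "type 'closure'" warning in this excerpt suggests that `data` referred to R's built-in function rather than to a matrix. A minimal sketch of both tasks follows, using a hypothetical matrix `m`; it relies on is.finite(), which is FALSE for NA, NaN, Inf and -Inf alike.

```r
# Hypothetical matrix (not from the original post); filled column by column.
m <- matrix(c(1, NA, 3,
              Inf, 5, 6,
              NaN, 8, 9), nrow = 3)

# 1) Keep only the rows in which every value is finite.
clean_rows <- m[rowSums(!is.finite(m)) == 0, , drop = FALSE]

# 2) Alternatively, overwrite every non-finite value with 0.
zeroed <- m
zeroed[!is.finite(zeroed)] <- 0

print(clean_rows)
print(zeroed)
```

The `drop = FALSE` argument keeps the result a matrix even when only one row survives.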

R count NA by group

Question: Could someone please explain why I get different answers when using the aggregate function to count missing values by group? Also, is there a better way to count missing values by group using a native R function?

    DF <- data.frame(YEAR=c(2000,2000,2000,2001,2001,2001,2001,2002,2002,2002),
                     X=c(1,NA,3,NA,NA,NA,7,8,9,10))
    DF
    aggregate(X ~ YEAR, data=DF, function(x) { sum(is.na(x)) })
    with(DF, aggregate(X, list(YEAR), function(x) { sum(is.na(x)) }))
    aggregate(X ~ YEAR, data=DF, function(x) { sum(! is
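A likely explanation for the differing results is that the formula interface of aggregate() defaults to `na.action = na.omit`, so rows with NA are dropped before the counting function ever sees them. The sketch below reuses the `DF` from the question; the result names are illustrative only.

```r
DF <- data.frame(YEAR = c(2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002),
                 X    = c(1, NA, 3, NA, NA, NA, 7, 8, 9, 10))

# Default formula interface: NAs are removed first, so every count is 0.
by_formula <- aggregate(X ~ YEAR, data = DF, FUN = function(x) sum(is.na(x)))

# Passing the NAs through gives the real counts.
by_formula_pass <- aggregate(X ~ YEAR, data = DF,
                             FUN = function(x) sum(is.na(x)),
                             na.action = na.pass)

# A base R alternative that does not go through a formula at all.
by_tapply <- tapply(is.na(DF$X), DF$YEAR, sum)

print(by_formula)
print(by_formula_pass)
print(by_tapply)
```

tapply() on the logical vector is.na(DF$X) is short and has no hidden NA handling to trip over.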

Omit rows containing specific column of NA

Question: I want to know how to omit NA values in a data frame, but only in the columns I am interested in. For example,

    DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))

but I only want to omit the rows where y is NA, so the result should be

      x  y  z
    1 1  0 NA
    2 2 10 33

na.omit seems to delete all rows that contain any NA. Can somebody help me with this simple question? But what if I now change the question to:

    DF <- data.frame(x = c(1, 2, 3,NA), y = c(1,0, 10, NA), z=c(43,NA, 33, NA))
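A minimal sketch of filtering on selected columns only, using the first `DF` from the question. The column selection `cols` is a hypothetical example, not something specified in the post.

```r
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

# Drop rows only where y is missing; NAs in other columns are left alone.
keep_y <- DF[!is.na(DF$y), ]

# The same idea for several columns at once (hypothetical selection).
cols <- c("x", "y")
keep_cols <- DF[complete.cases(DF[, cols]), ]

print(keep_y)
print(keep_cols)
```

complete.cases() returns one logical value per row, so it can be applied to any subset of columns while the full data frame is indexed.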

Randomly insert NAs into dataframe proportionally

Question: I have a complete data frame. I want 20% of the values in the data frame to be replaced by NAs to simulate random missing data.

    A <- c(1:10)
    B <- c(11:20)
    C <- c(21:30)
    df<- data.frame(A,B,C)

Can anyone suggest a quick way of doing that?

Answer 1:

    df <- data.frame(A = 1:10, B = 11:20, c = 21:30)
    head(df)
    ##   A  B  c
    ## 1 1 11 21
    ## 2 2 12 22
    ## 3 3 13 23
    ## 4 4 14 24
    ## 5 5 15 25
    ## 6 6 16 26
    as.data.frame(lapply(df, function(cc) cc[ sample(c(TRUE, NA), prob = c(0.85, 0.15), size = length(cc),
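The answer excerpt draws each cell independently with a fixed probability, so the share of NAs varies from run to run. If exactly 20% of the cells should be missing, one alternative is to sample cell positions instead. This is a sketch of that different technique, not the answer's method; the seed and the intermediate names are assumptions.

```r
set.seed(42)  # arbitrary seed, only so that the sketch is reproducible

df <- data.frame(A = 1:10, B = 11:20, C = 21:30)

n_cells   <- nrow(df) * ncol(df)
n_missing <- round(0.20 * n_cells)   # 6 of the 30 cells

# Work on a matrix copy so that cells can be addressed by one linear index.
m <- as.matrix(df)
m[sample(n_cells, n_missing)] <- NA
df_na <- as.data.frame(m)

print(df_na)
print(sum(is.na(df_na)))
```

Going through as.matrix() assumes all columns share one type, which holds for this all-integer example.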

Combine column to remove NA's

Question: I have some columns in R, and for each row there will only ever be a value in one of them; the rest will be NAs. I want to combine these into one column containing the non-NA value. Does anyone know of an easy way of doing this? For example, I could have the following:

    data <- data.frame('a' = c('A','B','C','D','E'),
                       'x' = c(1,2,NA,NA,NA),
                       'y' = c(NA,NA,3,NA,NA),
                       'z' = c(NA,NA,NA,4,5))

So I would have

    'a' 'x' 'y' 'z'
    A   1   NA  NA
    B   2   NA  NA
    C   NA  3   NA
    D   NA  NA  4
    E   NA  NA  5

And I would like to get

    'a' 'mycol'
    A   1
    B
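Since each row holds exactly one non-NA value, a parallel maximum that ignores NAs picks that value out. The sketch below uses the `data` frame from the question; `cols` and `result` are illustrative names.

```r
data <- data.frame(a = c("A", "B", "C", "D", "E"),
                   x = c(1, 2, NA, NA, NA),
                   y = c(NA, NA, 3, NA, NA),
                   z = c(NA, NA, NA, 4, 5))

cols <- c("x", "y", "z")

# pmax() compares the columns element by element; na.rm = TRUE skips the NAs,
# which leaves the single non-missing value in each row.
data$mycol <- do.call(pmax, c(data[cols], na.rm = TRUE))

result <- data[, c("a", "mycol")]
print(result)
```

This relies on the stated assumption of one value per row; if several columns could be filled, pmax() would silently return the largest of them.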

replace NA value with the group value

Question: I have a df as follows, which has 20 people across 5 households. Some people within a household have missing data for whether they have a med_card or not. I want to give these people the same value as the other people in their household (not an NA value, but a real binary value that is either 0 or 1). I have tried the following code, which I think is a step in the right direction, but it isn't 100% correct because a) it doesn't work if the first value for med_card per household is NA and b) it
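One base R way to fill within groups is ave(), which applies a function per group and returns a vector in the original row order. This is a sketch under stated assumptions: the data frame, the `household` column name and the helper `fill_group` are all hypothetical, and it assumes the known values within a household agree.

```r
# Hypothetical data; the first household starts with an NA on purpose.
df <- data.frame(household = c(1, 1, 1, 2, 2, 3, 3),
                 med_card  = c(NA, 1, 1, 0, NA, NA, 1))

# Hypothetical helper: copy the first known value of the group into its NAs.
fill_group <- function(x) {
  known <- x[!is.na(x)]
  if (length(known) > 0) {
    x[is.na(x)] <- known[1]
  }
  x  # groups with no known value are returned unchanged
}

df$med_card <- ave(df$med_card, df$household, FUN = fill_group)
print(df)
```

Because the helper looks for the first non-NA value rather than the first value, it also works when a household's first row is missing.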

Subsetting R data frame results in mysterious NA rows

Question: I've been encountering what I think is a bug. It's not a big deal, but I'm curious whether anyone else has seen this. Unfortunately, my data is confidential, so I have to make up an example, and it's not going to be very helpful. When subsetting my data, I occasionally get mysterious NA rows that aren't in my original data frame. Even the row names are NA. E.g.:

    example <- data.frame("var1"=c("A", "B", "A"), "var2"=c("X", "Y", "Z"))
    example
      var1 var2
    1    A    X
    2    B    Y
    3    A    Z

Then I run:

    example[example
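A frequent cause of such rows is an NA in the logical index: R returns an all-NA row for every NA in the condition. Whether that is what happened with the confidential data cannot be told from the excerpt, so the sketch below assumes a modified version of `example` in which `var1` contains an NA.

```r
# Assumption: unlike the example in the question, var1 has a missing value here.
example <- data.frame(var1 = c("A", NA, "A"), var2 = c("X", "Y", "Z"))

# The comparison yields TRUE, NA, TRUE; the NA produces a row of NAs.
with_na_row <- example[example$var1 == "A", ]

# which() keeps only the positions that are TRUE, so the NA row disappears.
without_na_row <- example[which(example$var1 == "A"), ]

print(with_na_row)
print(without_na_row)
```

Wrapping the condition in which(), or adding `& !is.na(example$var1)`, are two common ways to avoid the extra rows.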

Remove NA values from a vector

Question: I have a huge vector which has a couple of NA values, and I'm trying to find the max value in that vector (the vector is all numbers), but I can't do this because of the NA values. How can I remove the NA values so that I can compute the max?

Answer 1: Trying ?max, you'll see that it actually has a na.rm = argument, set by default to FALSE. (That's the common default for many other R functions, including sum(), mean(), etc.) Setting na.rm=TRUE does just what you're asking for:

    d <- c(1, 100,
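The effect of na.rm described in the answer can be seen with a small vector. The vector `v` below is hypothetical and is not a completion of the truncated `d` in the excerpt.

```r
v <- c(4, NA, 9, 2, NA)  # hypothetical values

max_default <- max(v)                # NA, because na.rm defaults to FALSE
max_na_rm   <- max(v, na.rm = TRUE)  # NAs are ignored

# Dropping the NAs explicitly is also possible when the cleaned vector is needed.
v_clean <- v[!is.na(v)]

print(max_default)
print(max_na_rm)
print(v_clean)
```

Using na.rm = TRUE avoids creating a copy of a large vector when only the summary value is needed.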