na

Behavior of summing !is.na() results

放肆的年华, posted on 2019-12-17 19:33:51
Question: Why does the first line return TRUE, while the third line returns 1? I would expect both to return 1. What exactly do the two extra pairs of parentheses in the third line do?

    !is.na(5) + !is.na(NA)
    # TRUE
    (!is.na(5)) + (!is.na(NA))
    # 1

Edit: I should check these multiple times. The original problem was with !is.na(); I thought it replicated for is.na(), but it didn't :)

Answer 1: ! has a weird, counter-intuitive precedence in R. Your first code is equivalent to

    !(is.na(5) + !is.na(NA))

That
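The precedence point in the answer can be checked directly. The following is a minimal sketch, not part of the original answer; the variable names are chosen here for illustration only.

```r
# In R, unary ! has lower precedence than +, so it negates the whole sum
# that follows it rather than just the next function call.
unparenthesized <- !is.na(5) + !is.na(NA)
explicit_equiv  <- !(is.na(5) + !is.na(NA))    # how R parses the line above
parenthesized   <- (!is.na(5)) + (!is.na(NA))  # negate each term, then add

print(unparenthesized)  # TRUE
print(explicit_equiv)   # TRUE
print(parenthesized)    # 1
```

Stepping through the first form: is.na(5) is FALSE, !is.na(NA) is FALSE, their sum is 0, and !0 is TRUE.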

Using ifelse() to replace NAs in one data frame by referencing another data frame of different length

徘徊边缘, posted on 2019-12-17 18:58:50
Question: I have already reviewed the following two posts and think they might answer my question, although I'm struggling to see how: 1) Conditional replacement of values in a data.frame; 2) Creating a function to replace NAs from one data.frame with values from another. With that said, I'm trying to replace NAs in one data frame by referencing another data frame of a different (shorter) length, pulling in replacement values from column "B" where the values in column "A" of the two data frames match. I've
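One possible way to do the lookup described in the question is to combine match() with ifelse(). This is only a sketch under assumptions: the question's data is not shown, so the data frames df1 and df2 and their contents below are hypothetical; only the column names "A" and "B" come from the question.

```r
# Hypothetical data: df1 has NAs in B, df2 is a shorter lookup table.
df1 <- data.frame(A = c("a", "b", "c", "d"), B = c(1, NA, 3, NA))
df2 <- data.frame(A = c("b", "d"), B = c(20, 40))

# For each row of df1, find the position of its key in df2 (NA if absent).
idx <- match(df1$A, df2$A)

# Replace only the missing values; existing values are left untouched.
df1$B <- ifelse(is.na(df1$B), df2$B[idx], df1$B)

print(df1)
```

Because match() returns a vector as long as df1$A, the differing lengths of the two data frames do not matter. Keys with no match in df2 simply stay NA.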

Remove NA/NaN/Inf in a matrix

拥有回忆, posted on 2019-12-17 16:37:05
Question: I want to do two things:

1) Remove rows that contain NA/NaN/Inf.
2) Set the value of any NA/NaN/Inf data point to 0.

So far I have tried the following for NA values, but I keep getting an error and a warning:

    > eg <- data[rowSums(is.na(data)) == 0,]
    Error in rowSums(is.na(data)) :
      'x' must be an array of at least two dimensions
    In addition: Warning message:
    In is.na(data) : is.na() applied to non-(list or vector) of type 'closure'

Answer 1: I guess I'll throw my hat into the ring with my
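A sketch of both tasks using is.finite(), which is FALSE for NA, NaN, Inf and -Inf alike. This is not the truncated answer's code, and the matrix m is a hypothetical stand-in for the asker's data. Note also that the message "of type 'closure'" means data was a function at the time of the call, which suggests the asker's own object had not actually been assigned to that name.

```r
# Hypothetical 3 x 3 numeric matrix containing NA, NaN and Inf.
m <- matrix(c(1, NA, 3,
              4, NaN, 6,
              Inf, 8, 9), nrow = 3)

# 1) Keep only rows in which every entry is finite.
#    drop = FALSE keeps the result a matrix even if one row survives.
m_clean <- m[rowSums(!is.finite(m)) == 0, , drop = FALSE]

# 2) Alternatively, overwrite every non-finite entry with 0.
m_zero <- m
m_zero[!is.finite(m_zero)] <- 0

print(m_clean)
print(m_zero)
```

is.finite() works on numeric matrices; for a data frame it would have to be applied column by column, for example via sapply().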

R count NA by group

不打扰是莪最后的温柔, posted on 2019-12-17 13:57:31
Question: Could someone please explain why I get different answers when using the aggregate function to count missing values by group? Also, is there a better way to count missing values by group using a native R function?

    DF <- data.frame(YEAR=c(2000,2000,2000,2001,2001,2001,2001,2002,2002,2002), X=c(1,NA,3,NA,NA,NA,7,8,9,10))
    DF
    aggregate(X ~ YEAR, data=DF, function(x) { sum(is.na(x)) })
    with(DF, aggregate(X, list(YEAR), function(x) { sum(is.na(x)) }))
    aggregate(X ~ YEAR, data=DF, function(x) { sum(! is
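A likely explanation for the differing counts is that the formula method of aggregate() uses na.action = na.omit by default, so rows with a missing X are dropped before the function ever sees them. The sketch below reuses the question's DF; the result names are chosen here for illustration.

```r
DF <- data.frame(YEAR = c(2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002),
                 X    = c(1, NA, 3, NA, NA, NA, 7, 8, 9, 10))

count_na <- function(x) sum(is.na(x))

# Default formula interface: NA rows are removed first, so every count is 0.
counts_default <- aggregate(X ~ YEAR, data = DF, FUN = count_na)

# Passing na.action = na.pass keeps the NA rows, giving the real counts.
counts_pass <- aggregate(X ~ YEAR, data = DF, FUN = count_na, na.action = na.pass)

# tapply() is a compact base R alternative that never drops rows.
counts_tapply <- tapply(DF$X, DF$YEAR, count_na)

print(counts_default)
print(counts_pass)
print(counts_tapply)
```

The non-formula call in the question, aggregate(X, list(YEAR), ...), does not apply na.omit, which is why it agrees with the na.pass version.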

Omit rows containing specific column of NA

橙三吉。, posted on 2019-12-17 07:01:08
Question: I want to know how to omit NA values in a data frame, but only in the columns I am interested in. For example,

    DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))

but I only want to omit the rows where y is NA, so the result should be

      x  y  z
    1 1  0 NA
    2 2 10 33

na.omit seems to delete all rows that contain any NA. Can somebody help me with this simple question? But what if I now change the question to:

    DF <- data.frame(x = c(1, 2, 3,NA), y = c(1,0, 10, NA), z=c(43,NA, 33, NA))
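For the first data frame in the question, filtering on a single column can be sketched with is.na(), and on several chosen columns with complete.cases(). The result names below are illustrative, and the choice of columns passed to complete.cases() is an assumption about what "columns I am interested in" might mean.

```r
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

# Drop rows only where y is missing; NAs in other columns are kept.
keep_y <- DF[!is.na(DF$y), ]

# Drop rows where any of a chosen set of columns is missing.
keep_yz <- DF[complete.cases(DF[, c("y", "z")]), ]

print(keep_y)
print(keep_yz)
```

keep_y reproduces the result requested in the question, with rows 1 and 2 retained and the NA in z preserved.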

Randomly insert NAs into dataframe proportionally

穿精又带淫゛_, posted on 2019-12-17 06:53:52
Question: I have a complete dataframe. I want 20% of the values in the dataframe to be replaced by NAs to simulate random missing data.

    A <- c(1:10)
    B <- c(11:20)
    C <- c(21:30)
    df<- data.frame(A,B,C)

Can anyone suggest a quick way of doing that?

Answer 1:

    df <- data.frame(A = 1:10, B = 11:20, c = 21:30)
    head(df)
    ##   A  B  c
    ## 1 1 11 21
    ## 2 2 12 22
    ## 3 3 13 23
    ## 4 4 14 24
    ## 5 5 15 25
    ## 6 6 16 26
    as.data.frame(lapply(df, function(cc) cc[ sample(c(TRUE, NA), prob = c(0.85, 0.15), size = length(cc),
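The answer's approach draws each cell independently, so the share of NAs is only approximately the target probability. As an alternative sketch, not taken from the answer, the code below blanks out exactly 20% of the cells by sampling cell positions. It assumes all columns share one type so that a round trip through a matrix is harmless; the names n_cells, na_idx and df_na are illustrative.

```r
set.seed(42)  # only to make the illustration reproducible

df <- data.frame(A = 1:10, B = 11:20, C = 21:30)

# Total number of cells, and a random sample of exactly 20% of their positions.
n_cells <- nrow(df) * ncol(df)
na_idx  <- sample(n_cells, size = round(0.2 * n_cells))

# A matrix can be indexed by single cell position, which a data frame cannot.
m <- as.matrix(df)
m[na_idx] <- NA
df_na <- as.data.frame(m)

print(df_na)
print(mean(is.na(df_na)))  # proportion of missing cells
```

For data frames with mixed column types, the same positions could instead be converted to row and column indices with arrayInd() and assigned column by column.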

Combine column to remove NA's

本小妞迷上赌, posted on 2019-12-17 06:35:28
Question: I have some columns in R, and for each row there will only ever be a value in one of them; the rest will be NAs. I want to combine these into one column holding the non-NA value. Does anyone know of an easy way of doing this? For example, I could have the following:

    data <- data.frame('a' = c('A','B','C','D','E'), 'x' = c(1,2,NA,NA,NA), 'y' = c(NA,NA,3,NA,NA), 'z' = c(NA,NA,NA,4,5))

So I would have

    'a' 'x' 'y' 'z'
    A   1   NA  NA
    B   2   NA  NA
    C   NA  3   NA
    D   NA  NA  4
    E   NA  NA  5

And I would like to get

    'a' 'mycol'
    A   1
    B
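One base R way to collapse the columns is to fold over them, taking the first non-NA value in each row. This is a sketch rather than the answer from the original thread; the helper name first_non_na and the variable result are introduced here for illustration.

```r
data <- data.frame(a = c("A", "B", "C", "D", "E"),
                   x = c(1, 2, NA, NA, NA),
                   y = c(NA, NA, 3, NA, NA),
                   z = c(NA, NA, NA, 4, 5))

# Keep u where it is present, otherwise fall back to v (element-wise).
first_non_na <- function(u, v) ifelse(is.na(u), v, u)

# A data frame is a list of columns, so Reduce() folds across x, y, z.
data$mycol <- Reduce(first_non_na, data[c("x", "y", "z")])

result <- data[c("a", "mycol")]
print(result)
```

Since each row holds exactly one value, rowSums(data[c("x", "y", "z")], na.rm = TRUE) would give the same numbers here, but the fold also works for non-numeric columns.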

replace NA value with the group value

匆匆过客, posted on 2019-12-17 06:16:33
Question: I have a df as follows, which has 20 people across 5 households. Some people within a household have missing data for whether or not they have a med_card. I want to give these people the same value as the other people in their household (not an NA value, but a real binary value, either 0 or 1). I have tried the following code, which I think is a step in the right direction but isn't 100% correct, because a) it doesn't work if the first value for med_card per household is NA and b) it
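A per-group fill of this kind can be sketched with ave(), which applies a function within each group and returns a vector in the original row order. The question's data and code are cut off, so everything below is an assumption: the data frame, its values, the column name household, and the helper fill_from_group are hypothetical; only med_card is named in the question.

```r
# Hypothetical data: household 1 starts with an NA, household 3 is all NA.
df <- data.frame(household = c(1, 1, 1, 2, 2, 3, 3),
                 med_card  = c(NA, 1, 1, 0, NA, NA, NA))

# Fill NAs with the first observed value in the group, wherever it occurs.
# If the whole group is missing there is nothing to copy, so it stays NA.
fill_from_group <- function(x) {
  x[is.na(x)] <- x[!is.na(x)][1]
  x
}

df$med_card_filled <- ave(df$med_card, df$household, FUN = fill_from_group)

print(df)
```

Because the helper looks for any non-missing value in the group rather than only the first element, it does not depend on the first value per household being present.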

Subsetting R data frame results in mysterious NA rows

☆樱花仙子☆, posted on 2019-12-17 04:24:48
Question: I've been encountering what I think is a bug. It's not a big deal, but I'm curious whether anyone else has seen this. Unfortunately, my data is confidential, so I have to make up an example, and it's not going to be very helpful. When subsetting my data, I occasionally get mysterious NA rows that aren't in my original data frame. Even the rownames are NA. E.g.:

    example <- data.frame("var1"=c("A", "B", "A"), "var2"=c("X", "Y", "Z"))
    example
      var1 var2
    1    A    X
    2    B    Y
    3    A    Z

then I run:

    example[example
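One common cause of this symptom, though the truncated question does not confirm it is the cause here, is an NA in the column being compared: the comparison yields NA, and indexing rows with NA produces an all-NA row. The sketch below uses a hypothetical variant of the question's data frame with one missing value in var1.

```r
# Hypothetical variant of the example with a missing value in var1.
example <- data.frame(var1 = c("A", NA, "A"), var2 = c("X", "Y", "Z"))

# The comparison is TRUE, NA, TRUE; the NA index yields a row of NAs.
with_na_rows <- example[example$var1 == "A", ]

# which() keeps only positions that are TRUE, silently dropping NAs.
clean_which <- example[which(example$var1 == "A"), ]

# An explicit alternative: require the value to be present as well.
clean_explicit <- example[!is.na(example$var1) & example$var1 == "A", ]

print(with_na_rows)
print(clean_which)
```

subset(example, var1 == "A") also treats NA conditions as FALSE, so it avoids the extra rows as well.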

Remove NA values from a vector

痞子三分冷, posted on 2019-12-17 02:19:07
Question: I have a huge vector which has a couple of NA values, and I'm trying to find the max value in that vector (the vector is all numbers), but I can't do this because of the NA values. How can I remove the NA values so that I can compute the max?

Answer 1: Trying ?max, you'll see that it actually has a na.rm = argument, set by default to FALSE. (That's the common default for many other R functions, including sum(), mean(), etc.) Setting na.rm=TRUE does just what you're asking for:

    d <- c(1, 100,
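Since the answer's example is cut off, here is a separate minimal sketch of the same idea. The vector v and its values are hypothetical and are not a reconstruction of the answer's d.

```r
# Hypothetical numeric vector with missing values.
v <- c(3, 100, NA, 10, NA)

max_default <- max(v)                # NA: any missing value poisons the result
max_na_rm   <- max(v, na.rm = TRUE)  # ignore NAs when computing the maximum

# To actually remove the NAs from the vector rather than just skip them:
v_clean <- v[!is.na(v)]

print(max_default)
print(max_na_rm)
print(v_clean)
```

Using na.rm = TRUE avoids copying a huge vector, whereas v[!is.na(v)] creates a new, shorter one that can be reused in later computations.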