na

How to replace NAs of a variable with values from another dataframe

Posted by 拟墨画扇 on 2019-12-02 08:15:55
I hope this one isn't stupid. I have two data frames with the variables ID and gender/sex. In df1, the variable contains NAs. In df2, it is complete. I want to fill in the column in df1 with the values from df2. (In df1 the variable is called "gender"; in df2 it is called "sex".) Here is what I have tried so far:

    #example-data
    ID<-seq(1,30,by=1)
    df1<-as.data.frame(ID)
    df2<-df1
    df1$gender<-c(NA,"2","1",NA,"2","2","2","2","2","2",NA,"2","1","1",NA,"2","2","2","2","2","1","2","2",NA,"2","2","2","2","2",NA)
    df2$sex<-c("2","2","1","2","2","2","2","2","2","2","2","2","1","1","2","2","2","2","2","2","1","2","2"
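One possible way to do this in base R, sketched on a smaller made-up data set rather than the 30-row example: look up each ID from df1 in df2 with match() and overwrite only the missing entries. The five-row frames and the helper names idx and missing are assumptions for illustration.

```r
# Illustrative data only; df2 is deliberately in a different row order
df1 <- data.frame(ID = 1:5,
                  gender = c(NA, "2", "1", NA, "2"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = 5:1,
                  sex = c("2", "1", "1", "2", "2"),
                  stringsAsFactors = FALSE)

# For every ID in df1, find the row of df2 holding the same ID
idx <- match(df1$ID, df2$ID)

# Replace only the entries that are missing in df1
missing <- is.na(df1$gender)
df1$gender[missing] <- df2$sex[idx[missing]]

df1$gender
```

Because the lookup goes through ID, the two data frames do not need to be sorted the same way. An ID that is absent from df2 simply stays NA.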

How can I keep NA when I change levels

Posted by 丶灬走出姿态 on 2019-12-02 07:13:39
Question: I build a factor vector containing NA.

    my_vec <- factor(c(NA,"a","b"),exclude=NULL)
    levels(my_vec)
    # [1] "a" "b" NA

I change one of those levels.

    levels(my_vec)[levels(my_vec) == "b"] <- "c"

NA disappears.

    levels(my_vec)
    # [1] "a" "c"

How can I keep it?

EDIT: @rawr gave a nice solution that works most of the time. It works for my previous specific example, but not for the one I'll show below. @Hack-R had a pragmatic option using addNA. I could make it work with that, but I'd rather a
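A possible workaround, sketched here only for the three-element example in the question: since assigning through levels(x) <- ... drops the NA level, the factor can instead be rebuilt with factor(..., exclude = NULL). The names old, new and renamed are assumptions for illustration, and this is not necessarily what the answers mentioned in the EDIT proposed.

```r
my_vec <- factor(c(NA, "a", "b"), exclude = NULL)

old <- levels(my_vec)            # "a" "b" NA
new <- old
new[which(old == "b")] <- "c"    # which() ignores the NA produced by NA == "b"

# Map each element to its new label, then rebuild the factor
# with exclude = NULL so that NA is kept as a level
renamed <- factor(new[as.integer(my_vec)], levels = new, exclude = NULL)

levels(renamed)
```

The rebuild keeps NA as the third level, so the element that was NA is still coded as a level rather than as a missing value.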

Fill in mean values for NA in every column of a data frame [duplicate]

Posted by 。_饼干妹妹 on 2019-12-02 04:54:46
This question already has an answer here: Replace missing values with column mean (11 answers)

If I have a data frame df

    df=data.frame(x=1:20,y=c(1:10,rep(NA,10)),z=c(rep(NA,5),1:15))

I know that, to replace NAs with the mean value for a given column, we can use

    df[is.na(df$x)]=mean(df$x,na.rm=T)

What I am trying to find is a way to do this for all columns at once with a single command, instead of repeating it for every column. Suspecting that I need sapply and a function, I tried something like this, but clearly it does not work:

    sapply(df,function(x) df[is.na(df$x)]=mean(df$x,na.rm=T))
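A minimal sketch of one way to do this in base R, assuming every column is numeric: apply a small function to each column with lapply() and assign the result back with df[] so the object stays a data frame. Note that the function indexes the column itself (col[is.na(col)]) rather than the whole data frame.

```r
df <- data.frame(x = 1:20,
                 y = c(1:10, rep(NA, 10)),
                 z = c(rep(NA, 5), 1:15))

# Replace the NAs of each column with that column's mean
df[] <- lapply(df, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

head(df)
```

Inside the anonymous function, col is the column being processed, so there is no need to refer back to df or to a column name.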

R: Replace elements with NA in a matrix in corresponding positions to NA's in another matrix

Posted by 梦想与她 on 2019-12-02 04:31:49
I have a large matrix, z, in which I replaced all values >3 with NA using:

    z[z>3]<-NA

I have another matrix, y, of identical dimensions, in which I need to set values to NA at the positions where elements were replaced in z. That is, if z[3,12] was >3 and replaced with NA, I need y[3,12] to be replaced with NA too. They have the same row names, if that helps.

Just use is.na on the first matrix to select the values to replace in the second matrix. Example:

    set.seed(1)
    m1 <- matrix(sample(5, 25, TRUE), 5)
    m2 <- matrix(sample(5, 25, TRUE), 5)
    m1[m1
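The is.na approach can be sketched on two small fixed matrices (the values here are made up so the result does not depend on random sampling):

```r
z <- matrix(c(1, 5, 2, 4, 3, 1), nrow = 2)
y <- matrix(11:16, nrow = 2)

# Step from the question: blank out large values in z
z[z > 3] <- NA

# Use the NA pattern of z as a logical index into y
y[is.na(z)] <- NA

y
```

This relies on both matrices having identical dimensions, as stated in the question. Any NA that was already in z before the thresholding step is propagated to y as well.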

How to sort putting NAs first in dplyr? [duplicate]

Posted by 喜你入骨 on 2019-12-02 03:46:28
Question: This question already has answers here: How to have NA's displayed first using arrange() (2 answers). Closed 2 years ago.

Consider the following example:

    require(tibble)
    require(dplyr)
    set.seed(42)
    tbl <- data_frame(id = letters[1:10], val = c(runif(5), NA, runif(4)))
    tbl
    # A tibble: 10 × 2
          id          val
       <chr>        <dbl>
    1      a 0.9148060435
    2      b 0.9370754133
    3      c 0.2861395348
    4      d 0.8304476261
    5      e 0.6417455189
    6      f           NA
    7      g 0.5190959491
    8      h 0.7365883146
    9      i 0.1346665972
    10     j 0.6569922904

I want to sort the
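One common idiom for this, shown as a sketch on a small made-up data frame rather than the tibble above: sort first on !is.na(val), then on val itself. It assumes the dplyr package is installed.

```r
library(dplyr)

tbl <- data.frame(id = letters[1:5],
                  val = c(0.9, NA, 0.2, NA, 0.5))

# FALSE sorts before TRUE, so rows where val is NA come first;
# the remaining rows are then ordered by val
sorted <- arrange(tbl, !is.na(val), val)

sorted
```

In base R, order(tbl$val, na.last = FALSE) gives a comparable row ordering without dplyr.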

Column name of last non-NA row per row; using tidyverse solution?

Posted by 假装没事ソ on 2019-12-02 02:42:13
Brief dataset description: I have survey data generated from Qualtrics, which I've imported into R as a tibble. Each column corresponds to a survey question, and I've preserved the original column order (to correspond with the order of the questions in the survey).

Problem in plain language: Due to normal participant attrition, not all participants completed all of the questions in the survey. I want to know how far each participant got in the survey, and the last question each of them answered before stopping.

Problem statement in R: I want to generate (using tidyverse): 1) A new column ( lastq
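As a rough illustration of the underlying idea, here is a base R sketch (not the tidyverse solution the question asks for). The column names q1 to q3 and last_question are hypothetical, and the data is made up.

```r
survey <- data.frame(q1 = c(1, 2, NA),
                     q2 = c(3, NA, NA),
                     q3 = c(NA, NA, NA))

# TRUE where a question was answered
answered <- !is.na(survey)

# Position of the last answered question in each row,
# or NA when a participant answered nothing at all
last_idx <- apply(answered, 1, function(r) {
  if (any(r)) max(which(r)) else NA_integer_
})

# Translate positions into column names
survey$last_question <- names(survey)[last_idx]

survey
```

This depends on the columns being in survey order, as the question says they are. A participant who skipped a question in the middle but answered a later one is reported at the later question.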

Replace a value NA with the value from another column in R

Posted by 拜拜、爱过 on 2019-12-01 22:29:24
Question: I want to replace the NA values in column A of dfABy with the values from column B, based on the year in column Year. For example, my df is:

    > dfABy
       A  B Year
      56 75 1921
      NA 45 1921
      NA 77 1922
      67 41 1923
      NA 65 1923

The result I expect is:

    > dfABy
        A  B Year
       56 75 1921
     *45* 45 1921
     *77* 77 1922
       67 41 1923
     *65* 65 1923

P.S.: the values marked with * are those replaced in column A by the value from column B for each year.

Answer 1: Perhaps the easiest to read/understand answer in R lexicon is to use ifelse. So
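A sketch of the ifelse approach the answer refers to, using the data from the question (the exact code in the original answer is cut off, so this is a reconstruction under that assumption):

```r
dfABy <- data.frame(A = c(56, NA, NA, 67, NA),
                    B = c(75, 45, 77, 41, 65),
                    Year = c(1921, 1921, 1922, 1923, 1923))

# Where A is missing take B from the same row, otherwise keep A
dfABy$A <- ifelse(is.na(dfABy$A), dfABy$B, dfABy$A)

dfABy
```

Since A and B sit in the same row, the Year column does not need to enter the replacement at all.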

How to select rows by group with the minimum value and containing NAs in R

Posted by 六眼飞鱼酱① on 2019-12-01 22:12:12
Here is an example:

    set.seed(123)
    data<-data.frame(X=rep(letters[1:3], each=4),Y=sample(1:12,12),Z=sample(1:100, 12))
    data[data==3]<-NA

What I want to achieve is to select, for each unique value of X, the row with the minimum Y, ignoring NAs:

    a 4 68
    b 1 4
    c 2 64

What's the best way to do that?

Using the data.table package, this is trivial:

    library(data.table)
    d <- data.table(data)
    d[, min(Y, na.rm=TRUE), by=X]

You can also use plyr and its ddply function:

    library(plyr)
    ddply(data, .(X), summarise, min(Y, na.rm=TRUE))

Or using base R:

    aggregate(Y ~ X, data=data, FUN=min)

Based on the edits, I would use data.table
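The snippets above return only the minimum Y per group. To keep the whole row, as in the desired output, one base R possibility is sketched below on a small hand-made data set (the values are chosen for illustration and are not the output of the set.seed(123) example). The names complete, is_min and result are assumptions.

```r
data <- data.frame(X = rep(c("a", "b", "c"), each = 2),
                   Y = c(4, NA, 1, 7, NA, 2),
                   Z = c(68, 10, 4, 30, 50, 64))

# Ignore rows where Y is missing
complete <- data[!is.na(data$Y), ]

# ave() returns the group minimum aligned with every row,
# so the comparison flags the row(s) holding that minimum
is_min <- complete$Y == ave(complete$Y, complete$X, FUN = min)
result <- complete[is_min, ]

result
```

If several rows in a group tie for the minimum Y, all of them are returned. A group whose Y values are all NA drops out of the result entirely.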

How to impute missing values with row mean in R

Posted by 守給你的承諾、 on 2019-12-01 21:24:53
Question: From a large data frame, I have extracted a row of numeric data and saved it as a vector. Some of the values are missing and marked as NA. I want to impute the missing values with the row mean. Thanks

Answer 1: Let x be your vector:

    x <- c(NA,0,2,0,2,NA,NA,NA,0,2)
    ifelse(is.na(x), mean(x, na.rm = TRUE), x)
    # [1] 1 0 2 0 2 1 1 1 0 2

Or, if you don't need to keep the original vector, you can modify it directly:

    x[is.na(x)] <- mean(x, na.rm = TRUE)

Answer 2: Use this:

    filter <- is.na(myVec)
    myVec[filter] <- colMeans
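If the same imputation is needed for every row rather than one extracted vector, the idea can be extended as in the sketch below. It assumes the data fits in a numeric matrix. The matrix m and the helper names are made up for illustration.

```r
m <- matrix(c(1, NA, 3,
              NA, 4, 5,
              3, 6, NA), nrow = 3)

# Mean of each row, ignoring missing values
row_means <- rowMeans(m, na.rm = TRUE)

# Row/column positions of every NA
idx <- which(is.na(m), arr.ind = TRUE)

# Fill each NA with the mean of the row it sits in
m[idx] <- row_means[idx[, "row"]]

m
```

A row that consists only of NAs has a row mean of NaN, so such rows would be filled with NaN rather than a number.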

How to impute missing values with row mean in R

Posted by 心已入冬 on 2019-12-01 21:22:59
From a large data frame, I have extracted a row of numeric data and saved it as a vector. Some of the values are missing and marked as NA. I want to impute the missing values with the row mean. Thanks

Let x be your vector:

    x <- c(NA,0,2,0,2,NA,NA,NA,0,2)
    ifelse(is.na(x), mean(x, na.rm = TRUE), x)
    # [1] 1 0 2 0 2 1 1 1 0 2

Or, if you don't need to keep the original vector, you can modify it directly:

    x[is.na(x)] <- mean(x, na.rm = TRUE)

Ferdinand.kraft

Use this:

    filter <- is.na(myVec)
    myVec[filter] <- colMeans(myDF[,filter], na.rm=TRUE)

Where myVec is your vector and myDF is your data.frame.

Source: https:/