na | 易学教程

How to replace NAs of a variable with values from another dataframe

I hope this one isn't stupid. I have two dataframes with the variables ID and gender/sex. In df1, there are NAs. In df2, the variable is complete. I want to complete the column in df1 with the values from df2. (In df1 the variable is called "gender". In df2 it is called "sex".) Here is what I tried so far:

    #example-data
    ID<-seq(1,30,by=1)
    df1<-as.data.frame(ID)
    df2<-df1
    df1$gender<-c(NA,"2","1",NA,"2","2","2","2","2","2",NA,"2","1","1",NA,"2","2","2","2","2","1","2","2",NA,"2","2","2","2","2",NA)
    df2$sex<-c("2","2","1","2","2","2","2","2","2","2","2","2","1","1","2","2","2","2","2","2","1","2","2"
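One way to approach this, sketched in base R under the assumption that ID identifies rows in both data frames. The five-row data below is a made-up stand-in for the asker's 30-row example, and `idx` and `missing` are hypothetical helper names.

```r
# Hypothetical, shortened stand-in for the asker's data
df1 <- data.frame(ID = 1:5, gender = c(NA, "2", "1", NA, "2"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = 1:5, sex = c("2", "2", "1", "1", "2"),
                  stringsAsFactors = FALSE)

# Look up each df1 row in df2 by ID, so the fill does not rely on row order
idx <- match(df1$ID, df2$ID)

# Overwrite only the missing entries of gender with the matching sex value
missing <- is.na(df1$gender)
df1$gender[missing] <- df2$sex[idx[missing]]

df1$gender
```

Matching on ID rather than copying the column positionally keeps the fill correct if the two data frames are sorted differently.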

How can I keep NA when I change levels

Question: I build a factor containing NA.

    my_vec <- factor(c(NA,"a","b"),exclude=NULL)
    levels(my_vec)
    # [1] "a" "b" NA

I change one of those levels.

    levels(my_vec)[levels(my_vec) == "b"] <- "c"

NA disappears.

    levels(my_vec)
    # [1] "a" "c"

How can I keep it?

EDIT: @rawr gave a nice solution that works most of the time; it works for my previous specific example, but not for the one I'll show below. @Hack-R had a pragmatic option using addNA. I could make it work with that, but I'd rather a
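A possible workaround, sketched here as an assumption rather than as either of the solutions mentioned in the question: rename the level in a plain character copy of the levels, then rebuild the factor with exclude = NULL so that NA stays a level. The names `lv` and `new_vec` are hypothetical.

```r
my_vec <- factor(c(NA, "a", "b"), exclude = NULL)

# Work on a plain character copy of the levels: "a" "b" NA
lv <- levels(my_vec)

# which() drops the NA produced by comparing NA with "b"
lv[which(lv == "b")] <- "c"

# Rebuild the factor from the renamed labels; exclude = NULL keeps NA as a level
new_vec <- factor(lv[as.integer(my_vec)], levels = lv, exclude = NULL)

levels(new_vec)
```

Rebuilding sidesteps the `levels<-` replacement function, which is where the NA level gets dropped in the question's example.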

Fill in mean values for NA in every column of a data frame [duplicate]

This question already has an answer here: Replace missing values with column mean (11 answers)

If I have a data frame df

    df=data.frame(x=1:20,y=c(1:10,rep(NA,10)),z=c(rep(NA,5),1:15))

I know that to replace NAs with the mean value for a given column, we can use

    df$x[is.na(df$x)]=mean(df$x,na.rm=TRUE)

What I am trying to find is a way to do this for all columns with a single command, instead of repeating it for every column. Suspecting that I need sapply and a function, I tried something like this, but clearly it does not work:

    sapply(df,function(x) df[is.na(df$x)]=mean(df$x,na.rm=T))
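A minimal base R sketch of one way to do this for every column at once, assuming all columns are numeric as in the example. Assigning into `df[]` keeps the result a data frame; the argument name `col` is arbitrary.

```r
df <- data.frame(x = 1:20, y = c(1:10, rep(NA, 10)), z = c(rep(NA, 5), 1:15))

# Apply the single-column fill to each column; df[] preserves the data frame
df[] <- lapply(df, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

colSums(is.na(df))  # no NAs remain in any column
```

The attempt in the question fails partly because `x` inside the function is already the column itself, so `df$x` always refers to the column literally named x.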

R: Replace elements with NA in a matrix in corresponding positions to NA's in another matrix

I have a large matrix, z, from which I removed all values >3 and replaced them with NA using:

    z[z>3]<-NA

I have another matrix, y, of identical dimensions, in which I need to replace values with NA in the positions where elements were replaced in z. That is, if z[3,12] was >3 and replaced with NA, I need y[3,12] to be replaced with NA too. They have the same row names if that helps.

Just use is.na on the first matrix to select the values to replace in the second matrix. Example:

    set.seed(1)
    m1 <- matrix(sample(5, 25, TRUE), 5)
    m2 <- matrix(sample(5, 25, TRUE), 5)
    m1[m1
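The is.na approach described in the answer can be sketched as follows. The 5 x 5 random matrices are illustrative stand-ins for the asker's large matrices, which are not shown.

```r
set.seed(1)
z <- matrix(sample(5, 25, TRUE), 5)
y <- matrix(sample(5, 25, TRUE), 5)

# Step from the question: blank out values above 3 in z
z[z > 3] <- NA

# is.na(z) is a logical matrix with the same shape as y,
# so it selects exactly the corresponding positions
y[is.na(z)] <- NA
```

Because the two matrices have identical dimensions, the logical mask from one can index the other directly, with no need to match on row names.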

How to sort putting NAs first in dplyr? [duplicate]

Question: This question already has answers here: How to have NA's displayed first using arrange() (2 answers)

Consider the following example:

    require(tibble)
    require(dplyr)
    set.seed(42)
    tbl <- data_frame(id = letters[1:10], val = c(runif(5), NA, runif(4)))
    tbl
    # A tibble: 10 × 2
          id          val
       <chr>        <dbl>
    1      a 0.9148060435
    2      b 0.9370754133
    3      c 0.2861395348
    4      d 0.8304476261
    5      e 0.6417455189
    6      f           NA
    7      g 0.5190959491
    8      h 0.7365883146
    9      i 0.1346665972
    10     j 0.6569922904

I want to sort the
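For comparison, a base R sketch of sorting with NAs first. It swaps the tibble for a plain data frame and uses `order()` instead of dplyr, so it is an illustration of the idea rather than the tidyverse answer the question asks for; `sorted` is a hypothetical name.

```r
set.seed(42)
tbl <- data.frame(id = letters[1:10], val = c(runif(5), NA, runif(4)))

# na.last = FALSE puts missing values at the top, then sorts ascending
sorted <- tbl[order(tbl$val, na.last = FALSE), ]

# With dplyr loaded, a commonly used equivalent is:
#   arrange(tbl, !is.na(val), val)
head(sorted, 3)
```

The dplyr variant works because `!is.na(val)` is FALSE for missing rows, and FALSE sorts before TRUE.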

Column name of last non-NA row per row; using tidyverse solution?

Brief dataset description: I have survey data generated from Qualtrics, which I've imported into R as a tibble. Each column corresponds to a survey question, and I've preserved the original column order (to correspond with the order of the questions in the survey).

Problem in plain language: Due to normal participant attrition, not all participants completed all of the questions in the survey. I want to know how far each participant got in the survey, and the last question they each answered before stopping.

Problem statement in R: I want to generate (using tidyverse): 1) A new column ( lastq
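The core of the problem, finding the last answered column in each row, can be sketched in base R as below. This is not the tidyverse solution requested; the three-question `survey` data frame and the helper names `answered` and `last_idx` are hypothetical, and only the column name lastq comes from the question.

```r
# Hypothetical survey: columns are in question order, NA means unanswered
survey <- data.frame(q1 = c(1, 2, NA), q2 = c(3, NA, NA), q3 = c(NA, NA, NA))

# TRUE where a participant gave an answer
answered <- !is.na(survey)

# Position of the last answered question per row (NA if none were answered)
last_idx <- apply(answered, 1, function(r) {
  if (any(r)) max(which(r)) else NA_integer_
})

# Translate positions into column names
survey$lastq <- names(survey)[last_idx]
survey$lastq
```

This relies on the column order matching the question order, which the dataset description says is the case.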

Replace a value NA with the value from another column in R

Question: I want to replace the NA values in column A of dfABy with the value from column B, based on the year in column Year. For example, my df is:

    > dfABy
    A   B   Year
    56  75  1921
    NA  45  1921
    NA  77  1922
    67  41  1923
    NA  65  1923

The result I expect is:

    > dfABy
    A     B   Year
    56    75  1921
    *45*  45  1921
    *77*  77  1922
    67    41  1923
    *65*  65  1923

P.S.: the values marked with * are those replaced in column A by the value from column B for every year.

Answer 1: Perhaps the easiest to read/understand answer in R lexicon is to use ifelse. So
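A minimal sketch of the ifelse approach the answer names, using the data from the question. The answer's own code is cut off in this excerpt, so the exact call below is an assumption about what it would look like.

```r
dfABy <- data.frame(
  A    = c(56, NA, NA, 67, NA),
  B    = c(75, 45, 77, 41, 65),
  Year = c(1921, 1921, 1922, 1923, 1923)
)

# Where A is missing take B from the same row, otherwise keep A
dfABy$A <- ifelse(is.na(dfABy$A), dfABy$B, dfABy$A)

dfABy$A
# [1] 56 45 77 67 65
```

Since the replacement comes from the same row, Year does not need to appear in the expression for this example.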

How to select rows by group with the minimum value and containing NAs in R

Here is an example:

    set.seed(123)
    data<-data.frame(X=rep(letters[1:3], each=4),Y=sample(1:12,12),Z=sample(1:100, 12))
    data[data==3]<-NA

What I want to achieve is to select the unique row of each X with the minimum Y, ignoring NAs:

    a 4 68
    b 1 4
    c 2 64

What's the best way to do that?

Using the data.table package, this is trivial:

    library(data.table)
    d <- data.table(data)
    d[, min(Y, na.rm=TRUE), by=X]

You can also use plyr and its ddply function:

    library(plyr)
    ddply(data, .(X), summarise, min(Y, na.rm=TRUE))

Or using base R:

    aggregate(Y ~ X, data=data, FUN=min)

Based on the edits, I would use data.table
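To return the whole row rather than just the minimum, one base R option is to compute the per-group minimum and merge it back. The sketch below uses small hand-made data, chosen to reproduce the desired output shown in the question, because `sample()` results differ across R versions; `mins` and `rows` are hypothetical names.

```r
# Hand-made stand-in for the asker's data, with NAs in Y
data <- data.frame(
  X = rep(letters[1:3], each = 2),
  Y = c(4, NA, 1, 7, NA, 2),
  Z = c(68, 10, 4, 33, 5, 64)
)

# The formula interface omits rows with NA in Y before taking the minimum
mins <- aggregate(Y ~ X, data = data, FUN = min)

# Join back on the shared columns X and Y to recover the full rows
rows <- merge(mins, data)
rows
```

If two rows in a group tie for the minimum Y, the merge returns both of them.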

How to impute missing values with row mean in R

From a large data frame, I have extracted a row of numeric data and saved it as a vector. Some of the values are missing and marked as NA. I want to impute the missing values with the row mean. Thanks

Let x be your vector:

    x <- c(NA,0,2,0,2,NA,NA,NA,0,2)
    ifelse(is.na(x), mean(x, na.rm = TRUE), x)
    # [1] 1 0 2 0 2 1 1 1 0 2

Or, if you don't care for the original vector, you can modify it directly:

    x[is.na(x)] <- mean(x, na.rm = TRUE)

Ferdinand.kraft

Use this:

    filter <- is.na(myVec)
    myVec[filter] <- colMeans(myDF[,filter], na.rm=TRUE)

Where myVec is your vector and myDF is your data.frame.

Source: https:/
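If the same row-mean imputation is needed for every row instead of one extracted vector, a sketch along these lines may help. It assumes an all-numeric data frame; the contents of `myDF` here are made up, and `m`, `row_means` and `idx` are hypothetical names.

```r
# Hypothetical all-numeric data frame with scattered NAs
myDF <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 5), c = c(3, 6, NA))

m <- as.matrix(myDF)

# Mean of each row, ignoring missing values
row_means <- rowMeans(m, na.rm = TRUE)

# Row and column positions of every NA
idx <- which(is.na(m), arr.ind = TRUE)

# Fill each NA with the mean of the row it sits in
m[idx] <- row_means[idx[, "row"]]
m
```

Note that this fills with row means, as the question asks, whereas the colMeans answer above fills each missing position with its column mean from the data frame.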