Populate the NA values in a variable with values from a different variable in R

Submitted by 爱⌒轻易说出口 on 2019-12-13 05:07:22

Question


I have data that looks like this:

Linking <- data.frame(
ID = c(round((runif(20, min = 10000, max = 99999)), digits = 0), NA, NA, NA, NA),
PSU = c(paste("A", round((runif(20, min = 10000, max = 99999)), digits = 0), sep = ''), NA, NA, NA, NA),
qtr = c(rep(1:10, 2), NA, NA, NA, NA)
)

Linking$Key <- paste(Linking$ID, Linking$PSU, Linking$qtr, sep = "_")
Linking$Key[c(21:24)] <- c("87654_A15467_1", "45623_A23456_2", "67891_A12345_4", "65346_A23987_7")

What I want to do is populate the NA values of ID, PSU, and qtr using the information in "Key", but only for the rows where those values are NA.

Does anyone know how to do this?

This code does what I want, but it does it for all values of each variable. I want to do this just for rows where the values are NA.

Linking2 <- Linking
Linking2$ID <- substr(Linking$Key,1,5)
Linking2$PSU <- substr(Linking$Key,7,12)
Linking2$qtr <- substr(Linking$Key, 14,15)

Answer 1:


The basic idea here is to assign using a logical index vector.

Linking$ID[is.na(Linking$ID)] <- substr(Linking$Key,1,5)[is.na(Linking$ID)]
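The same indexing idea extends to all three columns. One thing to watch for: substr() returns character strings, so assigning them into a numeric column coerces the whole column to character. The following is a minimal base R sketch, not the answerer's code. It assumes every Key has exactly three parts separated by "_", and it uses a small made-up data frame (created with stringsAsFactors = FALSE so that PSU is a plain character column) in place of the randomly generated one from the question.

```r
# Small stand-in for the question's data; the values are made up for illustration.
Linking <- data.frame(
  ID  = c(12345, 23456, NA, NA),
  PSU = c("A11111", "A22222", NA, NA),
  qtr = c(1L, 10L, NA, NA),
  Key = c("12345_A11111_1", "23456_A22222_10", "87654_A15467_1", "65346_A23987_7"),
  stringsAsFactors = FALSE
)

# Split every key on "_" once; assumes each key has exactly three parts.
parts <- do.call(rbind, strsplit(Linking$Key, "_", fixed = TRUE))

# Logical index vectors marking the rows that need filling.
na_id  <- is.na(Linking$ID)
na_psu <- is.na(Linking$PSU)
na_qtr <- is.na(Linking$qtr)

# Fill only the missing rows, converting to each column's existing type.
Linking$ID[na_id]   <- as.numeric(parts[na_id, 1])
Linking$PSU[na_psu] <- parts[na_psu, 2]
Linking$qtr[na_qtr] <- as.integer(parts[na_qtr, 3])
```

Splitting on the separator rather than using fixed substr() positions also copes with two-digit quarters such as 10.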



Answer 2:


You can also use tidyr::separate with dplyr::coalesce to separate values from Key and fill NA values in the first three columns.

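library(tidyverse)
Linking %>%
    # split Key into three temporary character columns (Key itself is dropped)
    separate(Key, into = paste0("tmp", 1:3), sep = "_") %>%
    # keep existing values; use the parsed value only where the column is NA
    mutate(ID = coalesce(ID, as.numeric(tmp1)),
           PSU = coalesce(as.character(PSU), tmp2),
           qtr = coalesce(qtr, as.integer(tmp3))) %>%
    select(-tmp1, -tmp2, -tmp3)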
#      ID    PSU qtr
#1  56421 A20914   1
#2  30912 A97582   2
#3  97547 A73397   3
#4  28277 A35255   4
#5  45107 A14867   5
#6  91099 A26196   6
#7  21117 A69721   7
#8  69065 A34741   8
#9  28781 A96847   9
#10 26098 A93942  10
#11 12680 A15705   1
#12 35231 A68219   2
#13 70556 A47693   3
#14 98281 A55697   4
#15 52687 A95023   5
#16 46129 A95540   6
#17 48721 A78827   7
#18 77989 A36778   8
#19 11757 A29458   9
#20 16575 A78892  10
#21 87654 A15467   1
#22 45623 A23456   2
#23 67891 A12345   4
#24 65346 A23987   7
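For readers unfamiliar with coalesce, here is a small illustration of its element-wise behavior. The vectors existing and fallback are hypothetical names made up for this sketch; they do not come from the question's data.

```r
library(dplyr)

existing <- c(10, NA, 30, NA)   # a column with gaps
fallback <- c(1, 2, 3, 4)       # replacement values of the same type

# For each position, take `existing` if it is not NA, otherwise `fallback`.
filled <- coalesce(existing, fallback)
```

Both arguments need compatible types, which is why values parsed from a character key have to be converted before they are coalesced with a numeric column.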


Source: https://stackoverflow.com/questions/49621647/populate-the-na-values-in-a-variable-with-values-from-a-different-variables-in-r
