how do I replace numeric codes with value labels from a lookup table?

ぐ巨炮叔叔 提交于 2019-11-27 13:29:47

问题


This question is related to this question, but not quite the same.

Say I have this data frame,

df <- data.frame(
                id = c(1:6),
                profession = c(1, 5, 4, NA, 0, 5))

and a string with human readable information about the profession codes. Say,

profession.code <- c(
                     Optometrists=1, Accountants=2, Veterinarians=3, 
                     `Financial analysts`=4,  Nurses=5)

Now, I'm looking for the easiest way to replace the values in df$profession with the text found in profession.code. Preferably without use of special libraries, unless it shortens the code significantly.

I would like my end result to be

df <- data.frame(
                id = c(1:6),
                profession = c("Optometrists", "Nurses", 
                "Financial analysts", NA, 0, "Nurses"))

Any help would be greatly appreciated.

Thanks, Eric


回答1:


You can do it this way:

df <- data.frame(id = c(1:6),
                 profession = c(1, 5, 4, NA, 0, 5))

profession.code <- c(`0` = 0, Optometrists=1, Accountants=2, Veterinarians=3, 
                     `Financial analysts`=4,  Nurses=5)

df$profession.str <- names(profession.code)[match(df$profession, profession.code)]
df
#   id profession     profession.str
# 1  1          1       Optometrists
# 2  2          5             Nurses
# 3  3          4 Financial analysts
# 4  4         NA               <NA>
# 5  5          0                  0
# 6  6          5             Nurses

Note that I had to add a 0 entry in your profession.code vector to account for those zeroes.

EDIT: here is an updated solution to account for Eric's comment below that the data may contain any number of profession codes for which there are no corresponding descriptions:

match.idx <- match(df$profession, profession.code)
df$profession.str <- ifelse(is.na(match.idx),
                            df$profession,
                            names(profession.code)[match.idx])



回答2:


I played around with it and this is my current solution using the car package.

pLoop <- function(v) paste(profession.code[v],"='", names(profession.code[v]),"';") 
library(car)
df$profession<- recode(df$profession, paste(sapply(1:5, pLoop),collapse=""))

df
# id           profession
#  1         Optometrists 
#  2               Nurses 
#  3   Financial analysts 
#  4                 <NA>
#  5                    0
#  6               Nurses 

Still interest to if anyone have other suggestions for a solution. I would prefer to do it using only the base function in R.




回答3:


I personally like the way the arules package deals with this problem, using the decode function. From the documentation:

library(arules)
data("Adult")

## Example 1: Manual decoding
## get code
iLabels <- itemLabels(Adult)
head(iLabels)

## get undecoded list and decode in a second step
list <- LIST(Adult[1:5], decode = FALSE)
list

decode(list, itemLabels = iLabels)

Advantage is that the package also offers the functions encode and recode. Their respective purpose is straightforward, I believe.



来源:https://stackoverflow.com/questions/10002536/how-do-i-replace-numeric-codes-with-value-labels-from-a-lookup-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!