How can I use purrr to match records from a lookup table?

Submitted by 送分小仙女 on 2020-01-04 05:33:30

Question


I have this dataset:

library(dplyr)
# tibble() replaces the deprecated data_frame()
tibble(Q1 = c('AL', NA, 'TX', 'FL'), Q2 = c('MN', 'CO', NA, NA), value = c(10, 24, 12, 54))
# A tibble: 4 x 3
     Q1    Q2 value
  <chr> <chr> <dbl>
1    AL    MN    10
2  <NA>    CO    24
3    TX  <NA>    12
4    FL  <NA>    54

I am trying to use purrr to convert the abbreviations in Q1 and Q2 into full state names using a lookup table:

lktState <- tibble(abb = state.abb, name = state.name)

So far I've tried this, but it doesn't work:

tibble(Q1 = c('AL', NA, 'TX', 'FL'), Q2 = c('MN', 'CO', NA, NA), value = c(10, 24, 12, 54)) %>%
  mutate_at(vars('Q1','Q2'), purrr::map(.x = ., lktState$name[match(.x, lktState$abb)]))

Error in match(.x, lktState$abb) : object '.x' not found


Answer 1:


Base R version (the two column lookups could be collapsed into a single step, but writing them out illustrates the concept):

xdf <- data.frame(
  Q1 = c('AL', NA, 'TX', 'FL'),
  Q2 = c('MN', 'CO', NA, NA),
  value = c(10, 24, 12, 54),
  stringsAsFactors = FALSE
)

xdf
##     Q1   Q2 value
## 1   AL   MN    10
## 2 <NA>   CO    24
## 3   TX <NA>    12
## 4   FL <NA>    54
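
# Named character vector: names are abbreviations, values are full state names
lktState <- setNames(state.name, state.abb)

# Indexing by name performs the lookup; NA keys yield NA
xdf$Q1 <- lktState[xdf$Q1]
xdf$Q2 <- lktState[xdf$Q2]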

xdf
##        Q1        Q2 value
## 1 Alabama Minnesota    10
## 2    <NA>  Colorado    24
## 3   Texas      <NA>    12
## 4 Florida      <NA>    54
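
The remark that the two assignments could be done in one step can be sketched as follows. This is only an illustration under the same data as above, not part of the original answer; the helper name cols is made up here, and unname() is added so the columns do not carry the abbreviations along as names.

```r
# Illustrative data, identical to the answer's example
xdf <- data.frame(
  Q1 = c("AL", NA, "TX", "FL"),
  Q2 = c("MN", "CO", NA, NA),
  value = c(10, 24, 12, 54),
  stringsAsFactors = FALSE
)

# Named vector lookup: abbreviation -> full state name
lktState <- setNames(state.name, state.abb)

# Hypothetical helper: the columns that hold abbreviations
cols <- c("Q1", "Q2")

# Apply the same lookup to every listed column in a single assignment
xdf[cols] <- lapply(xdf[cols], function(x) unname(lktState[x]))

xdf
```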

Tidyverse version:

library(dplyr)

# tibble() replaces the deprecated data_frame()
xdf <- tibble(
  Q1 = c('AL', NA, 'TX', 'FL'),
  Q2 = c('MN', 'CO', NA, NA),
  value = c(10, 24, 12, 54)
)

xdf
## # A tibble: 4 x 3
##      Q1    Q2 value
##   <chr> <chr> <dbl>
## 1    AL    MN    10
## 2  <NA>    CO    24
## 3    TX  <NA>    12
## 4    FL  <NA>    54
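
# Named character vector: names are abbreviations, values are full state names
lktState <- setNames(state.name, state.abb)

# funs() has been deprecated since dplyr 0.8.0; a ~ formula does the same job
mutate_at(xdf, .vars = vars(-value), .funs = ~ lktState[.])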
## # A tibble: 4 x 3
##        Q1        Q2 value
##     <chr>     <chr> <dbl>
## 1 Alabama Minnesota    10
## 2    <NA>  Colorado    24
## 3   Texas      <NA>    12
## 4 Florida      <NA>    54

There's no need to use "apply"-like idioms to do this basic lookup table assignment.
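
For readers on dplyr 1.0.0 or later, the same lookup can be sketched with across(), which supersedes mutate_at(). This is a minimal sketch rather than part of the original answer; the name result is illustrative, and unname() is added here only to drop the abbreviation names that indexing a named vector carries along.

```r
library(dplyr)

xdf <- tibble(
  Q1 = c("AL", NA, "TX", "FL"),
  Q2 = c("MN", "CO", NA, NA),
  value = c(10, 24, 12, 54)
)

# Named vector lookup: abbreviation -> full state name
lktState <- setNames(state.name, state.abb)

# across() applies the lookup to each selected column (requires dplyr >= 1.0.0)
result <- mutate(xdf, across(c(Q1, Q2), ~ unname(lktState[.x])))

result
```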




Answer 2:


I agree with Sotos that a join is the natural way to do this. However, your purrr solution is definitely fixable.
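
A minimal sketch of what the join approach might look like, assuming lktState is the two-column lookup table from the question. The object joined and the columns Q1_name and Q2_name are illustrative names, not from the original post.

```r
library(dplyr)

df <- tibble(
  Q1 = c("AL", NA, "TX", "FL"),
  Q2 = c("MN", "CO", NA, NA),
  value = c(10, 24, 12, 54)
)

# Lookup table as in the question: one row per state
lktState <- tibble(abb = state.abb, name = state.name)

# One left join per abbreviation column; keys without a match (including NA) give NA
joined <- df %>%
  left_join(lktState, by = c("Q1" = "abb")) %>%
  rename(Q1_name = name) %>%
  left_join(lktState, by = c("Q2" = "abb")) %>%
  rename(Q2_name = name)

joined
```

Each join adds a column instead of overwriting the abbreviation, so the original codes stay available next to the full names.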

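You are missing three things:

  1. For anything other than a bare function name, mutate_at needs the expression wrapped as a function. funs() used to do this; it has been deprecated since dplyr 0.8.0, and a ~ formula now does the same job.
  2. map functions use ~ notation for anonymous functions.
  3. You don't want to return a list, but rather a character vector, so use the _chr variant (map_chr).

# Outer ~ : applied to each selected column; inner ~ : applied to each element of that column
mutate_at(df,
          vars('Q1', 'Q2'),
          ~ purrr::map_chr(.x, ~ lktState$name[match(.x, lktState$abb)]))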

Gives:

# A tibble: 4 x 3
       Q1        Q2 value
    <chr>     <chr> <dbl>
1 Alabama Minnesota    10
2    <NA>  Colorado    24
3   Texas      <NA>    12
4 Florida      <NA>    54

Data

# tibble() replaces the deprecated data_frame()
df <- tibble(Q1 = c('AL', NA, 'TX', 'FL'), Q2 = c('MN', 'CO', NA, NA), value = c(10, 24, 12, 54))
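
Since match() is already vectorized, a purrr-only variant can map over whole columns rather than over single elements. The sketch below uses purrr::modify_at() for that; it is offered as one possible reading of the question, not as the answerer's method, and the helper to_name and the object result are hypothetical names introduced for illustration.

```r
library(dplyr)
library(purrr)

df <- tibble(
  Q1 = c("AL", NA, "TX", "FL"),
  Q2 = c("MN", "CO", NA, NA),
  value = c(10, 24, 12, 54)
)

# Lookup table as in the question
lktState <- tibble(abb = state.abb, name = state.name)

# Hypothetical helper: translate a whole vector of abbreviations at once
to_name <- function(x) lktState$name[match(x, lktState$abb)]

# modify_at() applies the helper to the named columns and keeps the data frame shape
result <- modify_at(df, c("Q1", "Q2"), to_name)

result
```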


Source: https://stackoverflow.com/questions/45117199/how-can-i-use-purrr-to-match-records-from-a-lookup-table
