Update/Replace Values in Dataframe with Tidyverse Join

说谎 2021-01-01 01:29

What is the most efficient way to update/replace NAs in the main dataset with (correct) values from a lookup table? This is such a common operation! Similar questions do not seem

5 Answers
  •  Happy的楠姐
    2021-01-01 01:49

    If the abbreviation column is complete and the lookup table is complete, could you just drop the state_name column and then join?

    left_join(df1 %>% select(-state_name), lookup_df, by = 'state_abbrev') %>% 
      select(state_abbrev, state_name, value)
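
A hedged variant of the join above: if the lookup table might not cover every abbreviation, dropping state_name first would throw away names that were already filled in. The sketch below joins and then uses coalesce() to fill only the missing names. The question's df1 and lookup_df are not shown in this thread, so the data here is an assumed stand-in, and the state_name_lookup column name is purely illustrative.

```r
library(dplyr)

# Assumed stand-in for the question's data (the real df1 is not shown):
# some state names are missing in the main table.
df1 <- tibble(
  state_abbrev = c("AL", "AK", "AZ", "AR", "CA"),
  state_name   = c("Alabama", NA, "Arizona", NA, "California"),
  value        = c(525, 719, 1186, 1051, 888)
)

# Assumed lookup table, built from R's built-in state vectors.
lookup_df <- tibble(state_abbrev = state.abb, state_name = state.name)

# Join, then take the first non-missing value per row:
# the existing name if present, otherwise the looked-up name.
patched <- df1 %>%
  left_join(lookup_df, by = "state_abbrev", suffix = c("", "_lookup")) %>%
  mutate(state_name = coalesce(state_name, state_name_lookup)) %>%
  select(-state_name_lookup)

print(patched)
```

With a complete lookup table this should give the same result as dropping the column and joining; the difference only shows up when a key is missing from the lookup.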
    

    Another option could be to use match and if_else in a mutate call, using the built-in state name and abbreviation vectors (state.name and state.abb):

    df1 %>% 
      mutate(state_name = if_else(is.na(state_name),
                                  state.name[match(state_abbrev, state.abb)],
                                  state_name))
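
To make the lookup step in that mutate call concrete, here is a small base R sketch of what match() does on its own. The abbrevs vector is a made-up example, and "XX" is deliberately not a real state so the unmatched case is visible.

```r
# Hypothetical abbreviations; "XX" is deliberately not a real state.
abbrevs <- c("CA", "AL", "XX")

# match() returns the position of each abbreviation in state.abb
# (NA when there is no match).
pos <- match(abbrevs, state.abb)
print(pos)

# Indexing state.name by those positions yields the full names;
# an NA position produces an NA name.
full_names <- state.name[pos]
print(full_names)
```

An abbreviation that is not in state.abb therefore stays NA after the mutate, rather than raising an error.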
    

    Both approaches, the join and the match/if_else version, give the same output:

    # A tibble: 10 x 3
       state_abbrev state_name  value
                      
     1 AL           Alabama       525
     2 AK           Alaska        719
     3 AZ           Arizona      1186
     4 AR           Arkansas     1051
     5 CA           California    888
     6 CO           Colorado      615
     7 CT           Connecticut   578
     8 DE           Delaware      894
     9 FL           Florida       536
    10 GA           Georgia       599       
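
For completeness, dplyr 1.0.0 and later also provide rows_patch(), which overwrites only the NA values in the first table using matching rows from the second. This is not one of the approaches in the answer above, just a sketch of an alternative; the data is again an assumed stand-in, and lookup_needed is an illustrative name.

```r
library(dplyr)

# Assumed stand-in data; the question's real tables are not shown.
df1 <- tibble(
  state_abbrev = c("AL", "AK", "AZ"),
  state_name   = c("Alabama", NA, NA),
  value        = c(525, 719, 1186)
)
lookup_df <- tibble(state_abbrev = state.abb, state_name = state.name)

# By default rows_patch() expects every key in the second table to exist
# in the first, so keep only the lookup rows whose abbreviation is in df1.
lookup_needed <- semi_join(lookup_df, df1, by = "state_abbrev")

# Fill NA cells in df1 from the matching lookup rows; non-NA cells are kept.
patched <- rows_patch(df1, lookup_needed, by = "state_abbrev")
print(patched)
```

The keys in the lookup table need to be unique for this to work, which holds for state abbreviations.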
    
