Avoiding type conflicts with dplyr::case_when


I am trying to use dplyr::case_when within dplyr::mutate to create a new variable where I set some values to missing and recode other values simult

2 Answers
  • 2020-12-13 18:53

    Try this:

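    # Refer to the column by name inside mutate() and assign the result back to df.
    # as.numeric() keeps every right-hand side double even if old is an integer.
    df <- df %>% dplyr::mutate(new = dplyr::case_when(old == 1 ~ 5,
                                                      old == 2 ~ NA_real_,
                                                      TRUE ~ as.numeric(old)))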
    
    > df
      old new
    1   1   5
    2   2  NA
    3   3   3
    
  • 2020-12-13 18:56

    As stated in ?case_when:

    All RHSs must evaluate to the same type of vector.

    You actually have two possibilities:

    1) Create new as a numeric vector

    df <- df %>% mutate(new = case_when(old == 1 ~ 5,
                                        old == 2 ~ NA_real_,
                                        TRUE ~ as.numeric(old)))
    

    Note that NA_real_ is the numeric (double) version of NA, and that you must convert old with as.numeric() because you created it as an integer in your original data frame.

    You get:

    str(df)
    # 'data.frame': 3 obs. of  2 variables:
    # $ old: int  1 2 3
    # $ new: num  5 NA 3
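
    The reason a typed NA and as.numeric() are needed becomes visible when you inspect the types of the individual pieces. The following is only a small base R illustration; 1:3 stands in for the old column and is an assumption based on the str() output above:

    ```r
    # Types of the values that could appear on the right-hand side of case_when()
    rhs_types <- c(
      bare_na = typeof(NA),        # a plain NA is logical
      na_real = typeof(NA_real_),  # the double-typed missing value
      five    = typeof(5),         # numeric literals are double
      old_col = typeof(1:3)        # stand-in for the integer column old
    )
    print(rhs_types)
    ```

    Only NA_real_ and 5 share a type here, which is why the bare NA and the integer column each need an explicit conversion.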
    

    2) Create new as an integer vector

    df <- df %>% mutate(new = case_when(old == 1 ~ 5L,
                                        old == 2 ~ NA_integer_,
                                        TRUE ~ old))
    

    Here, 5L forces 5 into the integer type, and NA_integer_ is the integer version of NA.

    So this time new is integer:

    str(df)
    # 'data.frame': 3 obs. of  2 variables:
    # $ old: int  1 2 3
    # $ new: int  5 NA 3
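
    To check both variants end to end, here is a minimal, self-contained sketch. It assumes the original data frame was data.frame(old = 1:3), which is inferred from the str() output and not shown in the question; the names num and int are made up for this example:

    ```r
    library(dplyr)

    # Assumed input: one integer column, matching the str() output
    df <- data.frame(old = 1:3)

    # Option 1: every right-hand side is double
    num <- df %>% mutate(new = case_when(old == 1 ~ 5,
                                         old == 2 ~ NA_real_,
                                         TRUE ~ as.numeric(old)))

    # Option 2: every right-hand side is integer
    int <- df %>% mutate(new = case_when(old == 1 ~ 5L,
                                         old == 2 ~ NA_integer_,
                                         TRUE ~ old))

    typeof(num$new)  # "double"
    typeof(int$new)  # "integer"
    ```

    As far as I know, dplyr 1.1.0 and later cast the right-hand sides to a common type, so the strict matching described here mainly affects older versions. Writing the types explicitly works in both.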
    