Cleaning up factor levels (collapsing multiple levels/labels)

礼貌的吻别 2020-11-22 14:27

What is the most effective (i.e. efficient / appropriate) way to clean up a factor containing multiple levels that need to be collapsed? That is, how to combine two or more fa

10 answers
  •  庸人自扰
    2020-11-22 14:48

    First let's note that in this specific case we can use partial matching:

    x <- c("Y", "Y", "Yes", "N", "No", "H")
    y <- c("Yes", "No")
    # pmatch() gives the index of the label each value is a unique prefix of (NA if none)
    x <- factor(y[pmatch(x, y, duplicates.ok = TRUE)])
    # [1] Yes  Yes  Yes  No   No   <NA>
    # Levels: No Yes
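
    As a rough illustration of when partial matching is safe, the sketch below shows how pmatch() resolves unique prefixes and returns NA for a value that matches no label or more than one. The `ambiguous` label set is hypothetical and only there to show the failure mode.

    ```r
    # Target labels, as in the example above
    y <- c("Yes", "No")

    # Unique prefixes (and exact matches) resolve to the index of the label
    idx <- pmatch(c("Y", "N", "No"), y, duplicates.ok = TRUE)

    # A value that is not a prefix of any label gives NA
    no_match <- pmatch("H", y)

    # Hypothetical label set in which "Y" is a prefix of two labels:
    # the ambiguous match also gives NA
    ambiguous <- c("Yes", "Yellow")
    amb_idx <- pmatch("Y", ambiguous)
    ```

    So the shortcut only works while every abbreviation is a prefix of exactly one target label.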
    

    In a more general case I'd go with dplyr::recode:

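    library(dplyr)
    x <- c("Y", "Y", "Yes", "N", "No", "H")
    y <- c(Y = "Yes", N = "No")   # old value = new value
    x <- recode(x, !!!y)          # splice the mapping in as named arguments
    x <- factor(x, y)             # values not in y (here "H") become NA
    # [1] Yes  Yes  Yes  No   No   <NA>
    # Levels: Yes No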
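
    If dplyr is not available, the same mapping can be sketched in base R with a named lookup vector. This is an alternative for comparison, not what the example above uses; `x_chr`, `lookup` and `x_fct` are illustrative names, and every accepted spelling has to be listed explicitly.

    ```r
    x_chr <- c("Y", "Y", "Yes", "N", "No", "H")

    # Lookup vector: names are the old values, elements are the new labels
    lookup <- c(Y = "Yes", Yes = "Yes", N = "No", No = "No")

    # Indexing by name translates each value; unknown values ("H") give NA
    x_fct <- factor(unname(lookup[x_chr]), levels = c("Yes", "No"))
    ```

    Unlike recode(), values missing from the lookup become NA immediately rather than being passed through unchanged.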
    

    Slightly altered if the starting point is a factor:

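    x <- factor(c("Y", "Y", "Yes", "N", "No", "H"))
    y <- c(Y = "Yes", N = "No")   # old level = new level
    x <- recode_factor(x, !!!y)   # recode the levels, keeping x a factor
    x <- factor(x, y)             # drop levels not in y (here "H" becomes NA)
    # [1] Yes  Yes  Yes  No   No   <NA>
    # Levels: Yes No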
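
    When the starting point is already a factor, a base-R sketch of the same collapse is to assign a named list to levels(). This is offered as an alternative rather than as part of the approach above; `f` is an illustrative name.

    ```r
    f <- factor(c("Y", "Y", "Yes", "N", "No", "H"))

    # Each list name is a new level; its element lists the old levels to merge.
    # Old levels not mentioned in the list (here "H") become NA.
    levels(f) <- list(Yes = c("Y", "Yes"), No = c("N", "No"))
    ```

    The order of the list entries determines the order of the resulting levels, so no second factor() call is needed.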
    
