Removing duplicate words in a string in R

栀梦 2020-12-11 03:47

Just to help someone who's just voluntarily removed their question, following a request for the code they tried and other comments. Let's assume they tried something like this:

4 answers
  •  暖寄归人
    2020-12-11 04:25

    To remove duplicate words while leaving punctuation and other special characters attached to their words, use this function:

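    rem_dup_word <- function(x) {
      # Lowercase first so that "Samsung" and "samsung" compare equal
      x <- tolower(x)
      # Split on runs of whitespace, drop repeated words, rejoin with single spaces
      words <- unlist(strsplit(trimws(x), split = "\\s+", perl = TRUE))
      paste(unique(words), collapse = " ")
    }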
    

    Input data:

    duptest <- "Samsung WA80E5LEC samsung Top Loading with Diamond Drum, 6 kg (Silver)"
    
    rem_dup_word(duptest)
    

    Output: "samsung wa80e5lec top loading with diamond drum, 6 kg (silver)"

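    Because the input is lowercased first, "Samsung" and "SAMSUNG" are treated as duplicates, and the result is returned in lowercase. Punctuation is not stripped, so "drum," and "drum" would count as different words.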
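
If the original casing should survive, one possible variation is to compare the words case-insensitively but keep each word as it was first written. The sketch below is only an illustration: the name `rem_dup_word_keep_case` is hypothetical and not part of the answer above, and it assumes that words are separated by whitespace.

```r
# Hypothetical variant of rem_dup_word(): compare words case-insensitively,
# but keep each word exactly as it first appeared in the input.
rem_dup_word_keep_case <- function(x) {
  # Split on runs of whitespace after trimming the ends
  words <- unlist(strsplit(trimws(x), split = "\\s+", perl = TRUE))
  # duplicated() flags every repeat after the first occurrence
  keep <- !duplicated(tolower(words))
  paste(words[keep], collapse = " ")
}

duptest <- "Samsung WA80E5LEC samsung Top Loading with Diamond Drum, 6 kg (Silver)"
rem_dup_word_keep_case(duptest)
# [1] "Samsung WA80E5LEC Top Loading with Diamond Drum, 6 kg (Silver)"
```

Using `duplicated()` on the lowercased words rather than `unique()` is what makes it possible to index back into the original, unmodified words.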
