Remove/collapse consecutive duplicate values in sequence

一向 2020-11-27 19:08

I have the following dataframe:

a a a b c c d e a a b b b e e d d

The required result should be

a b c d e a b e d         


        
4 Answers
  •  春和景丽
    2020-11-27 19:38

    library(dplyr)
    x <- c("a", "a", "a", "b", "c", "c", "d", "e", "a", "a", "b", "b", "b", "e", "e", "d", "d")
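    # default must be a character value that cannot equal the first element;
    # recent dplyr versions reject a numeric default for a character vector
    x[x != lag(x, default = "")]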
    #[1] "a" "b" "c" "d" "e" "a" "b" "e" "d"
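
    The same result can be had without dplyr. A minimal base R sketch, assuming the same vector x as above (the names collapsed_rle, keep and collapsed_idx are only illustrative):

    ```r
    x <- c("a", "a", "a", "b", "c", "c", "d", "e", "a", "a", "b", "b", "b", "e", "e", "d", "d")

    # rle() encodes runs of equal values; $values holds one entry per run
    collapsed_rle <- rle(x)$values

    # Index-based form: always keep element 1, then keep every element
    # that differs from its predecessor
    keep <- c(TRUE, x[-1] != x[-length(x)])
    collapsed_idx <- x[keep]

    print(collapsed_rle)
    print(collapsed_idx)
    ```

    Both forms return one value per run. The logical vector keep is the more reusable of the two, since it can also be used to subset the rows of a data frame.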
    

    EDIT: for a data.frame:

    mydf <- data.frame(
      V1 = c("a", "a", "a", "b", "c", "c", "d", "e",
             "a", "a", "b", "b", "e", "e", "d", "d"),
      V2 = c(1, 2, 3, 2, 4, 1, 3, 9,
             4, 8, 10, 199, 2, 5, 4, 10),
      stringsAsFactors = FALSE)
    

    The dplyr solution is a one-liner:

    mydf %>% filter(V1!= lag(V1, default="1"))
    #  V1 V2
    #1  a  1
    #2  b  2
    #3  c  4
    #4  d  3
    #5  e  9
    #6  a  4
    #7  b 10
    #8  e  2
    #9  d  4
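
    For comparison, the same row filter can be sketched in base R, assuming the mydf defined above (is_run_start and first_rows are hypothetical names, not part of the original answer):

    ```r
    mydf <- data.frame(
      V1 = c("a", "a", "a", "b", "c", "c", "d", "e",
             "a", "a", "b", "b", "e", "e", "d", "d"),
      V2 = c(1, 2, 3, 2, 4, 1, 3, 9,
             4, 8, 10, 199, 2, 5, 4, 10),
      stringsAsFactors = FALSE)

    # TRUE for row 1 and for every row whose V1 differs from the row above it
    is_run_start <- c(TRUE, mydf$V1[-1] != mydf$V1[-nrow(mydf)])

    # Keep only the first row of each run of equal V1 values
    first_rows <- mydf[is_run_start, ]
    print(first_rows)
    ```

    Unlike filter(), base subsetting keeps the original row names, so the printed row labels show where each run started in mydf.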
    

    P.S.

    lead(x, 1), suggested by @Carl Witthoft, compares each element with the next one instead of the previous one, so it keeps the last row of each run rather than the first.

    leadit<-function(x) x!=lead(x, default="what")
    rows <- leadit(mydf[ ,1])
    mydf[rows, ]
    
    #   V1  V2
    #3   a   3
    #4   b   2
    #6   c   1
    #7   d   3
    #8   e   9
    #10  a   8
    #12  b 199
    #14  e   5
    #16  d  10
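
    The difference between the two comparisons is easier to see on a short vector. A small base R sketch, with a made-up vector and illustrative names (first_of_run, last_of_run):

    ```r
    x <- c("a", "a", "b", "b", "b", "c")
    n <- length(x)

    # Compare with the previous element (the lag-style test):
    # positions where a run starts
    first_of_run <- which(c(TRUE, x[-1] != x[-n]))

    # Compare with the next element (the lead-style test):
    # positions where a run ends
    last_of_run <- which(c(x[-n] != x[-1], TRUE))

    print(first_of_run)  # 1 3 6
    print(last_of_run)   # 2 5 6
    ```

    Both tests yield the same collapsed values; they differ only in which row of each run survives, which matters once other columns such as V2 are carried along.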
    
