Remove consecutive duplicates from dataframe

Submitted by 岁酱吖の on 2019-11-30 06:49:34

Here's one way that doesn't use rle:

dat[with(dat, c(TRUE, diff(as.numeric(interaction(v1, v2))) != 0)), ]

This assumes you're using factor columns, as your sample data implies.
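To make the idea concrete, here is a minimal sketch on a small made-up data frame (the `dat` below is hypothetical, not the original poster's data). It splits the one-liner into steps: build a combined factor, take its integer codes, and keep each row whose code differs from the previous row's.

```r
# Hypothetical sample data (an assumption, not the original data frame)
dat <- data.frame(
  v1 = factor(c("A", "A", "B", "B", "A", "C", "C")),
  v2 = factor(c("Jan", "Jan", "Feb", "Feb", "Jan", "Mar", "Mar"))
)

# interaction() builds one factor from the (v1, v2) pairs; its integer
# code changes whenever either column changes between adjacent rows
codes <- as.integer(interaction(dat$v1, dat$v2))

# Keep the first row, plus every row whose code differs from the row before
keep <- c(TRUE, diff(codes) != 0)
res_interaction <- dat[keep, ]
print(res_interaction)
```

Note that this variant keeps the first row of each run of duplicates.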

Here is a fast solution using filter (stats::filter, not dplyr's filter):

dat[(filter(dat,c(-1,1))!= 0)[,1],]
     v1   v2
1     A  Jan
3     E  May
4     B  Feb
7     A  Jan
8     D  Apr
10    A  Mar
11    B  Feb
12    E  May
15    B  Feb
18    C  Mar
19    D  Apr
NA <NA> <NA>

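The last row comes out as NA because the filter has no following row to compare against, so you need to add the last row of the original data to the result. Note also that only the first column (v1) is checked here.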
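One possible way to handle the trailing NA is to treat it as "end of a run" and keep that row. The sketch below assumes a small made-up `dat` and, like the one-liner above, compares only `v1`; it calls `stats::filter` explicitly so that a loaded dplyr does not mask it.

```r
# Hypothetical sample data (an assumption, not the original data frame)
dat <- data.frame(
  v1 = factor(c("A", "A", "B", "B", "A", "C", "C")),
  v2 = factor(c("Jan", "Jan", "Feb", "Feb", "Jan", "Mar", "Mar"))
)

# With the two-element filter c(-1, 1), each output is x[i] - x[i + 1]
# on the factor codes; the last element has no successor and is NA
d <- as.numeric(stats::filter(as.integer(dat$v1), c(-1, 1)))

# A nonzero difference marks the last row of a run; the NA at the end
# is the last row of the data, which always closes a run
keep <- is.na(d) | d != 0
res_filter <- dat[keep, ]
print(res_filter)
```

This keeps the last row of each run, and the final row of the data is no longer lost.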

Using rle, I came up with this:

ind <- cumsum(rle(as.character(dat$v1))$lengths)
dat[ind, ]

ind holds the index of the last row of each run of consecutive entries (for a run of length one, that is also its first row).
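As a quick illustration of what the run lengths give you, the sketch below uses a made-up character vector (hypothetical values) and derives both the last and the first index of each run from `rle`.

```r
# Hypothetical values standing in for dat$v1
v <- c("A", "A", "B", "B", "A", "C", "C")

r <- rle(v)

# Cumulative run lengths give the position where each run ends
last_idx <- cumsum(r$lengths)

# Stepping back by the run length (plus one) gives where each run starts
first_idx <- last_idx - r$lengths + 1

print(last_idx)
print(first_idx)
```

Indexing with `first_idx` instead of `last_idx` keeps the first row of each run, which matters when other columns differ within a run.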

EDIT:

A simple solution to Matthew's comment, which checks both columns, would be:

dat[15, 2] <- "May"
dat[cumsum(rle(paste0(dat$v1, dat$v2))$lengths), ]
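One caveat with pasting columns together without a separator is that different pairs can collapse to the same key. The sketch below uses deliberately contrived, hypothetical data to show this, and assumes the character "|" never occurs in the columns so it can serve as a separator.

```r
# Contrived hypothetical data: "ab" + "c" and "a" + "bc" both paste to "abc"
dat <- data.frame(
  v1 = c("ab", "a", "a"),
  v2 = c("c", "bc", "bc"),
  stringsAsFactors = FALSE
)

# Without a separator, all three rows get the same key
key_plain <- paste0(dat$v1, dat$v2)
res_plain <- dat[cumsum(rle(key_plain)$lengths), ]

# With a separator assumed absent from the data, row 1 stays distinct
key_sep <- paste(dat$v1, dat$v2, sep = "|")
res_sep <- dat[cumsum(rle(key_sep)$lengths), ]

print(res_plain)
print(res_sep)
```

For data like month names and single letters this collision is unlikely, so the separator is a precaution rather than a required change.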