Return 0 to second instance of duplicate

Question

I have a data set similar to the one shown below (columns A, B, and C):

I'd like to add a column D that returns 1 for each row until a combination of A and B is duplicated. For the duplicate I need a 0, but only for the second instance, like so:

A  B   C  D
1  10  5  1
1  20  1  1
2  30  1  1
2  30  1  0

Any help appreciated.

Answer 1:

An option would be

df$D <- as.integer(!duplicated(df[c("A", "B")]))
df$D
#[1] 1 1 1 0
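
For a reproducible check, the sketch below rebuilds the example data and applies the same duplicated() call. The name df and the numeric column types are assumptions taken from the table in the question.

```r
# Example data rebuilt from the question's table (numeric columns assumed)
df <- data.frame(
  A = c(1, 1, 2, 2),
  B = c(10, 20, 30, 30),
  C = c(5, 1, 1, 1)
)

# duplicated() flags every row whose (A, B) pair has already appeared above it
dup <- duplicated(df[c("A", "B")])

# Negate and convert to integer: 1 for a first occurrence, 0 for any repeat
df$D <- as.integer(!dup)
df$D
#[1] 1 1 1 0
```

Note that duplicated() marks every repeat, so a third occurrence of the same A and B pair would also get 0.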

Answer 2:

Just a doodle with library(dplyr):

df %>% group_by(A,B) %>% mutate(D = +((1:n())==1))

Or, if you want it to be zero "only for the second instance", meaning the third instance would also be one, then the following works:

df %>% group_by(A,B) %>% mutate(D = +!((1:n())==2))
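
The two variants only differ once a pair occurs three or more times. Here is a minimal sketch of that difference, assuming dplyr is installed. The data frame df3 and the column names first_only and not_second are made up for illustration, and row_number() stands in for 1:n().

```r
library(dplyr)

# Hypothetical data: the pair (A = 2, B = 30) occurs three times
df3 <- data.frame(
  A = c(1, 1, 2, 2, 2),
  B = c(10, 20, 30, 30, 30),
  C = c(5, 1, 1, 1, 1)
)

out <- df3 %>%
  group_by(A, B) %>%
  mutate(
    # 1 only for the first row of each (A, B) group
    first_only = as.integer(row_number() == 1),
    # 0 only for the second row of each (A, B) group
    not_second = as.integer(row_number() != 2)
  ) %>%
  ungroup()  # drop the grouping so later steps work on the whole table

out$first_only
#[1] 1 1 1 0 0
out$not_second
#[1] 1 1 1 0 1
```

The ungroup() call is an addition here, not part of the answer above. It only prevents later operations from silently running per group.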

In your example, the duplicated rows match on C as well as on A and B. If that's actually the case, you can use group_by_all() instead of group_by(A, B).
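
Whether C is part of the key only matters when two rows share A and B but differ in C. The base R sketch below illustrates this with duplicated() rather than group_by_all(). The data frame df2 is hypothetical and differs from the question's data in its last C value.

```r
# Hypothetical data: rows 3 and 4 share A and B but differ in C
df2 <- data.frame(
  A = c(1, 1, 2, 2),
  B = c(10, 20, 30, 30),
  C = c(5, 1, 1, 2)
)

# Key on A and B only: row 4 counts as a duplicate of row 3
by_ab <- as.integer(!duplicated(df2[c("A", "B")]))

# Key on all columns: no row is an exact repeat of an earlier one
by_all <- as.integer(!duplicated(df2))

by_ab
#[1] 1 1 1 0
by_all
#[1] 1 1 1 1
```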

Source: https://stackoverflow.com/questions/56366368/return-0-to-second-instance-of-duplicate

Tags

dataframe

duplicates
