Return 0 to second instance of duplicate

Posted by 旧城冷巷雨未停 on 2020-08-09 18:10:20

Question


I have a data set similar to the following:

A  B   C 
1  10  5 
1  20  1
2  30  1
2  30  1

I'd like to add a column D that returns 1 for every row until a combination of A and B is repeated. For the repeat I need a 0, but only for the second instance, so:

A  B   C  D
1  10  5  1
1  20  1  1
2  30  1  1
2  30  1  0

Any help appreciated.


Answer 1:


One option is duplicated() from base R, which flags every row whose A and B pair has already appeared:

df$D <- as.integer(!duplicated(df[c("A", "B")]))
df$D
#[1] 1 1 1 0
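
To try this without the original data, here is a minimal sketch that rebuilds the question's table by hand (the numeric column types are an assumption) and then appends a hypothetical third copy of the 2/30 row. It shows that duplicated() marks every repeat after the first, not just the second one.

```r
# Rebuild the example data from the question (column types assumed numeric)
df <- data.frame(A = c(1, 1, 2, 2),
                 B = c(10, 20, 30, 30),
                 C = c(5, 1, 1, 1))

# 1 for the first occurrence of each A/B pair, 0 for any later repeat
df$D <- as.integer(!duplicated(df[c("A", "B")]))
df$D
# [1] 1 1 1 0

# Hypothetical extra row: a third instance of the A = 2, B = 30 pair
df3 <- rbind(df[c("A", "B", "C")], data.frame(A = 2, B = 30, C = 1))
df3$D <- as.integer(!duplicated(df3[c("A", "B")]))
df3$D
# [1] 1 1 1 0 0
```

If a third instance should go back to 1, duplicated() alone is not enough and a within-group counter is needed instead.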
 



Answer 2:


A quick sketch with library(dplyr), numbering the rows within each A/B group and keeping 1 only for the first:

df %>% group_by(A,B) %>% mutate(D = +((1:n())==1))

Or, if you want zero "only for the second instance", so that a third instance would be 1 again, the following works:

df %>% group_by(A,B) %>% mutate(D = +!((1:n())==2))
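
The same idea can be written with row_number(), which some readers find easier to follow than 1:n(). The sketch below is only an illustration: the data frame df3 and its extra third 2/30 row are made up to show that only the second instance gets 0.

```r
library(dplyr)

# Hypothetical data: the question's rows plus a third A = 2, B = 30 row
df3 <- data.frame(A = c(1, 1, 2, 2, 2),
                  B = c(10, 20, 30, 30, 30),
                  C = c(5, 1, 1, 1, 1))

res <- df3 %>%
  group_by(A, B) %>%
  # row_number() counts rows within each A/B group; only position 2 gets 0
  mutate(D = as.integer(row_number() != 2)) %>%
  ungroup()

res$D
# [1] 1 1 1 0 1
```

Calling ungroup() at the end is optional, but it avoids carrying the grouping into later steps.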

In your example the duplicated rows match on C as well as on A and B. If that is actually the case, you can use group_by_all() instead of group_by(A,B).
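
For comparison, the difference between matching on A and B only and matching on every column can be seen in base R, where duplicated() accepts either a column subset or the whole data frame. The data frame df2 below is hypothetical: its third row repeats A and B but has a different C.

```r
# Hypothetical data: row 3 repeats A and B but differs in C
df2 <- data.frame(A = c(2, 2, 2),
                  B = c(30, 30, 30),
                  C = c(1, 1, 9))

# Duplicates judged on A and B only: rows 2 and 3 are both repeats
d_ab <- as.integer(!duplicated(df2[c("A", "B")]))
d_ab
# [1] 1 0 0

# Duplicates judged on all columns: only row 2 is an exact repeat
d_all <- as.integer(!duplicated(df2))
d_all
# [1] 1 0 1
```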



Source: https://stackoverflow.com/questions/56366368/return-0-to-second-instance-of-duplicate
