Extract original and duplicate result(s) from a data frame in R [duplicate]

Question

I use duplicate results to estimate the measurement uncertainty for chemical analyses. When I extract data from the laboratory database, it consists largely of single results, but some samples were tested twice and some more than twice (I have seen up to 12). I want to discard all the single analyses and retain only the duplicated results, including the original result.

The samples are identified by a sample number that is common to the duplicate samples.

I can pull out the duplicates using duplicated(), but how do I retain the first result as well?

Thanks.

Answer 1:

> dat <- data.frame(
    id = sample(1:5, 10, replace = TRUE),
    x = rnorm(10)
    )

> dat
##    id          x
## 1   1  0.7060512
## 2   4  0.6804117
## 3   2  0.2395902
## 4   2  1.5352574
## 5   1  0.2376593
## 6   4  0.8019506
## 7   1 -1.0506505
## 8   5  1.0554555
## 9   3  0.3637685
## 10  5 -0.8404215
> dat[duplicated(dat$id) | duplicated(dat$id, fromLast = TRUE),]
##    id          x
## 1   1  0.7060512
## 2   4  0.6804117
## 3   2  0.2395902
## 4   2  1.5352574
## 5   1  0.2376593
## 6   4  0.8019506
## 7   1 -1.0506505
## 8   5  1.0554555
## 10  5 -0.8404215
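
The expression above works because duplicated() flags every occurrence of an id after the first, while duplicated(..., fromLast = TRUE) flags every occurrence before the last; combining the two with | covers every row whose id appears more than once, including the original. Here is a minimal sketch on a small fixed data frame (the values are made up for illustration and are not from the question), together with an equivalent count-based filter using ave():

```r
# Hypothetical fixed data so the result is reproducible
dat <- data.frame(
  id = c(1, 4, 2, 2, 1, 3),
  x  = c(0.7, 0.6, 0.2, 1.5, 0.3, 0.4)
)

# TRUE for each id that has already appeared earlier in the vector
later <- duplicated(dat$id)

# TRUE for each id that appears again later in the vector
earlier <- duplicated(dat$id, fromLast = TRUE)

# The union keeps every row whose id occurs more than once
kept <- dat[later | earlier, ]

# Equivalent idea: count the rows per id and keep those with a count above 1
kept2 <- dat[ave(seq_along(dat$id), dat$id, FUN = length) > 1, ]
```

Both kept and kept2 retain the rows with id 1 and 2 and drop the single results for id 4 and 3. The count-based form may be easier to adapt if you later want, say, only samples tested at least three times.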

Source: https://stackoverflow.com/questions/21359904/extract-original-and-duplicate-results-from-a-data-frame-in-r

Tags

duplicates
