R: how to aggregate by real values column with given error tolerance

Submitted by 我与影子孤独终老i on 2019-12-11 14:06:20

Question


Suppose I have a data frame:

t <- data.frame(d1=c( 694, 695, 696, 2243, 2244, 2651, 2652 ),
                d2=c(1.80950881, 1.80951007, 1.80951052, 1.46499982, 1.46500087, 1.14381419, 1.14381319 ))

    d1       d2
1  694 1.809509
2  695 1.809510
3  696 1.809511
4 2243 1.465000
5 2244 1.465001
6 2651 1.143814
7 2652 1.143813

I'd like to group rows whose d2 values are very close but not exactly equal. In this example, after aggregation, I'd like to obtain the following data set:

    d1       d2
1  694 1.809509
2 2243 1.465000
3 2652 1.143813

that is, the row with the minimum d2 value from each group.

Using the aggregate function, my first attempt:

aggregate(t, by=list(t$d2), FUN=min)
   Group.1   d1       d2
1 1.143813 2652 1.143813
2 1.143814 2651 1.143814
3 1.465000 2243 1.465000
4 1.465001 2244 1.465001
5 1.809509  694 1.809509
6 1.809510  695 1.809510
7 1.809511  696 1.809511

is far from reaching my goal.

How can I tell aggregate to group not by exact equality, but by equality within a given error tolerance?


Answer 1:


Here is an approach with tidyverse:

library(tidyverse)
t %>%
  group_by(grp = round(d2, 1)) %>% # group by d2 rounded to 1 decimal place
  filter(d2 == min(d2)) %>% # keep the row with the minimum d2 in each group
  ungroup() %>% # ungroup so the grouping column can be removed
  select(-grp) # drop the helper grouping column



Answer 2:


This works with your toy data; I don't know whether it will with your real data, where you might have to round to more or fewer digits.

aggregate(t, by=list(round(t$d2,4)), FUN=min)
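
Two caveats about the rounding approach may be worth checking on real data. First, aggregate applies FUN=min to each column separately, so the d1 and d2 in a result row can come from different original rows (for the last group above it returns d1 = 2651, while the minimum d2 belongs to d1 = 2652). Second, rounding can split two values that are close but sit on opposite sides of a rounding boundary. The following is a minimal base R sketch of an alternative, not the method from either answer: sort by d2 and start a new group whenever the gap to the previous value exceeds a tolerance. The tolerance tol = 1e-4 and the names s, grp and res are assumptions chosen for illustration.

```r
t <- data.frame(d1 = c(694, 695, 696, 2243, 2244, 2651, 2652),
                d2 = c(1.80950881, 1.80951007, 1.80951052, 1.46499982,
                       1.46500087, 1.14381419, 1.14381319))

tol <- 1e-4  # assumed tolerance; adjust to your data

# Sort rows by d2 so that close values become neighbours
s <- t[order(t$d2), ]

# Start a new group whenever the gap to the previous sorted value exceeds tol
grp <- cumsum(c(TRUE, diff(s$d2) > tol))

# Rows are sorted ascending, so the first row of each group has the minimum d2
res <- s[!duplicated(grp), ]
rownames(res) <- NULL

# For comparison: rounding plus a column-wise min, as in the answer above
agg <- aggregate(t, by = list(round(t$d2, 4)), FUN = min)
```

Here res keeps whole rows, ordered by increasing d2. Note that gap-based grouping chains neighbours: a long run of values each within tol of the next ends up in one group even if its first and last values differ by more than tol.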


Source: https://stackoverflow.com/questions/48955848/r-how-to-aggregate-by-real-values-column-with-given-error-tolerance
