Question
This might be an easy one. Here's the data:
dat <- read.table(header=TRUE, text="
Seg ID Distance
Seg46 V21 160.37672
Seg72 V85 191.24400
Seg373 V85 167.38930
Seg159 V147 14.74852
Seg233 V171 193.01636
Seg234 V171 200.21458
")
dat
Seg ID Distance
Seg46 V21 160.37672
Seg72 V85 191.24400
Seg373 V85 167.38930
Seg159 V147 14.74852
Seg233 V171 193.01636
Seg234 V171 200.21458
I want a table like the following, which gives the Seg with the minimum Distance for each ID (some ID values appear more than once):
Seg Crash_ID Distance
Seg46 V21 160.37672
Seg373 V85 167.38930
Seg159 V147 14.74852
Seg233 V171 193.01636
I am trying to use ddply to solve it, but it is not getting me there:
ddply(dat, "Seg", summarize, min = min(Distance))
Seg min
Seg159 14.74852
Seg233 193.01636
Seg234 200.21458
Seg373 167.38930
Seg46 160.37672
Seg72 191.24400
Answer 1:
We can subset the rows with which.min. After grouping by 'ID', we slice the rows based on the position of the minimum 'Distance'.
library(dplyr)
dat %>%
group_by(ID) %>%
slice(which.min(Distance))
A similar option using data.table would be:
library(data.table)
setDT(dat)[, .SD[which.min(Distance)], by = ID]
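For readers without dplyr or data.table installed, the same idea (group by ID, then keep the row at which.min(Distance) in each group) can be sketched in base R. This is an alternative under that assumption, not part of the original answer; the names pieces and closest are made up for illustration.

```r
# Sample data from the question.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Seg ID Distance
Seg46 V21 160.37672
Seg72 V85 191.24400
Seg373 V85 167.38930
Seg159 V147 14.74852
Seg233 V171 193.01636
Seg234 V171 200.21458
")

# Split the rows into one data frame per ID.
pieces <- split(dat, dat$ID)

# In each piece keep the single row with the smallest Distance,
# then bind the pieces back into one data frame.
closest <- do.call(rbind, lapply(pieces, function(d) d[which.min(d$Distance), ]))
rownames(closest) <- NULL
closest
```

The grouping variable here is ID, not Seg. Every Seg in the sample data is unique, so grouping by Seg simply returns every row.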
Answer 2:
If you prefer ddply, you could do this:
library(plyr)
ddply(dat, .(ID), summarize,
Seg = Seg[which.min(Distance)],
Distance = min(Distance))
# ID Seg Distance
#1 V147 Seg159 14.74852
#2 V171 Seg233 193.01636
#3 V21 Seg46 160.37672
#4 V85 Seg373 167.38930
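One caveat with the which.min approaches: which.min returns only the first position of the minimum, so if two segments were exactly tied for the same ID, only one of them would be kept. If all tied rows should be kept, a minimal base R sketch (an alternative, not taken from the answers) compares each Distance with its group minimum computed by ave. The row Seg999 is hypothetical, added only to demonstrate a tie.

```r
# Question data plus one hypothetical row (Seg999) that ties with Seg373 for V85.
dat_ties <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Seg ID Distance
Seg46 V21 160.37672
Seg72 V85 191.24400
Seg373 V85 167.38930
Seg999 V85 167.38930
Seg159 V147 14.74852
Seg233 V171 193.01636
Seg234 V171 200.21458
")

# ave() returns, for every row, the minimum Distance within that row's ID group.
group_min <- ave(dat_ties$Distance, dat_ties$ID, FUN = min)

# Keep every row whose Distance equals its group minimum (ties included).
all_closest <- dat_ties[dat_ties$Distance == group_min, ]
all_closest
```

Unlike the grouped approaches, this keeps the rows in their original order.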
Source: https://stackoverflow.com/questions/32377541/subset-data-based-on-minimum-value