问题
For a dataset like this
Incident.ID.. date product
INCFI0000029582 2014-09-25 08:39:45 foo
INCFI0000029582 2014-09-25 08:39:48 bar
INCFI0000029582 2014-09-25 08:40:44 foo
INCFI0000029582 2014-10-10 23:04:00 foo
INCFI0000029587 2014-09-25 08:33:32 bar
INCFI0000029587 2014-09-25 08:34:41 bar
INCFI0000029587 2014-09-25 08:35:24 bar
INCFI0000029587 2014-10-10 23:04:00 foo
df <- structure(list(Incident.ID.. = c("INCFI0000029582", "INCFI0000029582",
"INCFI0000029582", "INCFI0000029582", "INCFI0000029587", "INCFI0000029587",
"INCFI0000029587", "INCFI0000029587"), date = c("2014-09-25 08:39:45",
"2014-09-25 08:39:48", "2014-09-25 08:40:44", "2014-10-10 23:04:00",
"2014-09-25 08:33:32", "2014-09-25 08:34:41", "2014-09-25 08:35:24",
"2014-10-10 23:04:00"), product =
c("foo","bar","foo","foo","bar","bar","bar","foo")),
class = "data.frame", row.names = c(NA,
-8L))
I am calculating the rolling difference in time by id using the mutate function as follows
library(dplyr)
library(lubridate)
df1 <- df %>%
group_by(Incident.ID..) %>%
mutate(diff = c(0, diff(ymd_hms(date))))
This creates a column diff as follows
Incident.ID.. date product diff
INCFI0000029582 2014-09-25 08:39:45 foo 0
INCFI0000029582 2014-09-25 08:39:48 bar 3
INCFI0000029582 2014-09-25 08:40:44 foo 56
INCFI0000029582 2014-10-10 23:04:00 foo 1347796
INCFI0000029587 2014-09-25 08:33:32 bar 0
INCFI0000029587 2014-09-25 08:34:41 bar 69
INCFI0000029587 2014-09-25 08:35:24 bar 43
INCFI0000029587 2014-10-10 23:04:00 foo 1348116
Now my goal is to aggregate/collapse rows from zero to zero, with the expected final dataset like this
Incident.ID.. DateMin DateMax product
INCFI0000029582 2014-09-25 08:39:45 2014-10-10 23:04:00 foo,bar,foo,foo
INCFI0000029587 2014-09-25 08:33:32 2014-10-10 23:04:00 bar,bar,bar,foo
I am not sure how to collapse rows as shown above with a min and max date column , I need help. Thanks in advance.
回答1:
The group_by attribute remains after the mutate, so we summarise by the group to get the min, max of 'the 'date' and collapse the 'product' by pasteing the elements together (toString is a convenient wrapper for paste(., collapse=", "))
df %>%
group_by(Incident.ID..) %>%
mutate(diff = c(0, diff(ymd_hms(date)))) %>%
summarise(DateMin = min(date),
DateMax = max(date),
product = toString(product))
# A tibble: 2 x 4
# Incident.ID.. DateMin DateMax product
# <chr> <chr> <chr> <chr>
#1 INCFI0000029582 2014-09-25 08:39:45 2014-10-10 23:04:00 foo, bar, foo, foo
#2 INCFI0000029587 2014-09-25 08:33:32 2014-10-10 23:04:00 bar, bar, bar, foo
来源:https://stackoverflow.com/questions/54425618/collapse-rows-from-0-to-0