Question
I want to group by color and calculate the date range for each color. I have tried group_by(), summarize(), and aggregate().
#Data:
df1 <- as.Date(c('Jul 1', 'Jun 26', 'July 5', 'July 15'), format = '%B %d')
df2 <- c("red", "blue", "red", "blue")
df1 <- data.frame(df1,df2)
What I'm trying to get:
# Group.1 x
[1] 4 red
[2] 19 blue
I have been trying this:
df <- aggregate(df1[,1], list(df1[,2]), as.numeric(max(df1[,1]) - min(df1[,1]), units="days"))
I have tested as.numeric(max(df1[,1]) - min(df1[,1]), units="days") on its own and it returns the value I'm looking for; I just can't figure out how to return that value for each color.
My error message is below, but realistically I think I'm just going about this the wrong way.
Error in match.fun(FUN) :
'as.numeric(max(df1$date) - min(df1$date), units = "days")' is not a function, character or symbol
After reading through the aggregate() documentation, I tried to use formula = for the last argument, which returned this error:
Error in match.fun(FUN) : argument "FUN" is missing, with no default
Answer 1:
With dplyr:
library(dplyr)
# Inside summarise(), df1 refers to the date column, not the data frame
df1 %>%
  group_by(df2) %>%
  summarise(Range = max(df1) - min(df1))
# A tibble: 2 x 2
df2 Range
<fct> <drtn>
1 blue 19 days
2 red 4 days
Answer 2:
Using aggregate:
aggregate(df1~ df2, df1, function(x) diff(range(x)))
Note that the data frame 'df1' has columns that are also named 'df1' and 'df2', which creates some confusion. It may be better to create the data with distinct names, as
df1 <- data.frame(x = df1$df1, Group = df1$df2)  # df1 is already a data frame at this point
and then with the formula method,
aggregate(x ~ Group, df1, function(x) diff(range(x)))
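As for why the original call failed: aggregate() expects FUN to be a function that it can call once per group, but the question passed an expression that R evaluates first, handing aggregate() a single number. A minimal sketch of the fix is below. The names dat, date, color and day_span are made up for illustration, and the year 2019 is an arbitrary assumption so the dates parse the same way in any locale.

```r
# Sample data from the question, rebuilt with distinct (hypothetical) names.
# The year 2019 is an assumption; the question's dates carry no year.
dat <- data.frame(
  date  = as.Date(c("2019-07-01", "2019-06-26", "2019-07-05", "2019-07-15")),
  color = c("red", "blue", "red", "blue")
)

# FUN must be a function: aggregate() calls it separately on each group's dates.
day_span <- function(x) as.numeric(diff(range(x)), units = "days")

res <- aggregate(date ~ color, data = dat, FUN = day_span)
print(res)
```

Wrapping the difference in as.numeric(..., units = "days") returns plain numbers, as in the output the question asks for, rather than a difftime.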
Answer 3:
require(dplyr)
df001 <- as.Date(c('Jul 1', 'Jun 26', 'July 5', 'July 15'), format = '%B %d')
df002 <- c("red", "blue", "red", "blue")
df003 <- data.frame(df001,df002)
df003 %>% rename(dates = df001, colors = df002) %>%
group_by(colors) %>%
summarise(min_date = min(dates), max_date = max(dates)) %>%
mutate(range = max_date - min_date) %>%
select(colors, range)
#
# # A tibble: 2 x 2
# colors range
# <fct> <time>
# 1 blue 19
# 2 red 4
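If a plain numeric column is wanted, the min/max/mutate/select steps can probably be collapsed into a single summarise() call. This is only a sketch, assuming dplyr is installed; dat, date, color and days are illustrative names, and the year 2019 is an arbitrary assumption.

```r
library(dplyr)

# Same sample data as the question, with an assumed year of 2019.
dat <- data.frame(
  date  = as.Date(c("2019-07-01", "2019-06-26", "2019-07-05", "2019-07-15")),
  color = c("red", "blue", "red", "blue")
)

# One row per color; as.numeric(..., units = "days") drops the difftime class.
res <- dat %>%
  group_by(color) %>%
  summarise(days = as.numeric(max(date) - min(date), units = "days"))

print(res)
```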
Source: https://stackoverflow.com/questions/57113810/return-date-range-by-group