dplyr does not group data by date

佛祖请我去吃肉 2021-02-02 01:00

I am trying to calculate how often bikes are taken out, using a dataset provided by Leada.

Here is the code:

library(dplyr)

setAs(\"cha         


        
2 answers
  •  南旧 (original poster)
    2021-02-02 01:14

    The lubridate package is useful when dealing with dates. Here is the code to parse Start.Date and End.Date, extract the weekday, then group by weekday:

    Read dates as character vectors

    library(dplyr)
    library(lubridate)
    # Loading the csv directly from the url did not work for me,
    # so I saved it to a temporary directory first.
    d <- read.csv("/tmp/bike_trip_data.csv",
                  colClasses = c("numeric", "numeric", "character", "factor",
                                 "numeric", "character", "factor", "numeric",
                                 "numeric", "factor", "character"),
                  stringsAsFactors = TRUE)
    
    names(d)[9] <- "BikeNo"
    d <- as_tibble(d)  # tbl_df() is deprecated in current dplyr
    

    Use lubridate to convert start date and end date

    d <- d %>% 
      mutate(
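        # Parse the character columns into date-times (UTC by default)
        Start.Date = parse_date_time(Start.Date, "%m/%d/%y %H:%M"),
        End.Date = parse_date_time(End.Date, "%m/%d/%y %H:%M"),
        # Full weekday name of the trip start, as an ordered factor
        Weekday = wday(Start.Date, label = TRUE, abbr = FALSE))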
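
To see what the conversion does on a single value, here is a minimal sketch. The timestamp is a made-up sample in the same month/day/two-digit-year layout that the format string above implies, not a row taken from the dataset, and the variable names are only for illustration.

```r
library(lubridate)

# Hypothetical sample timestamp, not taken from the bike trip data
x <- "8/29/13 14:13"

# "mdy HM" is the order-style spelling of "%m/%d/%y %H:%M"
parsed <- parse_date_time(x, orders = "mdy HM")

# Numeric weekday: 1 = Sunday ... 7 = Saturday with the default week start
day_number <- wday(parsed)

# Full weekday name; the label text depends on the session locale
day_label <- wday(parsed, label = TRUE, abbr = FALSE)
```

Using the numeric form of `wday()` can be handy when the result must not depend on the locale.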
    

    Number of rows per weekday

    d %>%
      group_by(Weekday) %>%
      summarise(Total = n())
    
    #     Weekday Total
    # 1    Sunday 10587
    # 2    Monday 23138
    # 3   Tuesday 24678
    # 4 Wednesday 23651
    # 5  Thursday 25265
    # 6    Friday 24283
    # 7  Saturday 12413
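
If the goal is a count per calendar day rather than per weekday, the same pattern should work once the time of day is dropped, since grouping on the full timestamp would give one group per distinct minute. A minimal sketch on made-up timestamps (the `trips` data frame and the `Day` column are hypothetical and only mimic the `Start.Date` layout):

```r
library(dplyr)
library(lubridate)

# Made-up trips, not taken from the bike trip data
trips <- tibble(
  Start.Date = c("8/29/13 14:13", "8/29/13 18:54",
                 "8/30/13 09:08", "9/1/13 11:30")
)

daily <- trips %>%
  mutate(
    Start.Date = parse_date_time(Start.Date, orders = "mdy HM"),
    Day = as_date(Start.Date)  # keep the calendar date, drop the time
  ) %>%
  count(Day, name = "Total")   # shorthand for group_by(Day) + summarise(n())

print(daily)
```

Here `count()` is used only as a shorter spelling of the `group_by()` and `summarise()` pair shown above; either form gives one row per day.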
    
