How to calculate business hours between two dates when business hours vary depending on the day in R?

Submitted by 穿精又带淫゛_ on 2021-02-08 08:39:18

Question


I'm trying to calculate business hours between two dates. Business hours vary depending on the day.

Weekdays have 15 business hours (8:00-23:00); Saturdays and Sundays have 12 business hours (9:00-21:00).

For example, with a start date of 07/24/2020 22:20 (Friday) and an end date of 07/25/2020 21:20 (Saturday), the result should be 12.67 hours, since I'm only interested in the business hours.

Here is an example of the data frame and the desired output:

start_date            end_date            business_hours
07/24/2020 22:20     07/25/2020 21:20        12.67
07/14/2020 21:00     07/16/2020 09:30        18.50
07/18/2020 08:26     07/19/2020 10:00        13.00
07/10/2020 08:00     07/13/2020 11:00        42.00

 

Answer 1:


Here is something you can try with lubridate. I adapted it from another function I had that I thought might be helpful.

First, create a sequence of dates between the two dates of interest. Then create one interval per date based on that day's business hours, checking whether each date falls on a weekend.

Then, "clamp" the start and end times to the allowed business hours time intervals using pmin and pmax.

You can use time_length to get the time measurement of the intervals; summing them up will give you total time elapsed.
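
The clamping step can be looked at in isolation. The following is a minimal sketch in base R, assuming a single Friday with the 08:00-23:00 window from the question; clamp_time, open_t and close_t are hypothetical names used only for illustration and are not part of lubridate.

```r
# Clamp a timestamp into one day's business window [open_t, close_t].
# clamp_time() is a hypothetical helper, not a library function.
clamp_time <- function(x, open_t, close_t) pmax(pmin(x, close_t), open_t)

open_t  <- as.POSIXct("2020-07-24 08:00", tz = "UTC")
close_t <- as.POSIXct("2020-07-24 23:00", tz = "UTC")

# Inside the window: the timestamp is left unchanged (22:20)
inside <- clamp_time(as.POSIXct("2020-07-24 22:20", tz = "UTC"), open_t, close_t)
# Before opening: moved forward to 08:00
early <- clamp_time(as.POSIXct("2020-07-24 06:15", tz = "UTC"), open_t, close_t)
# After closing: moved back to 23:00
late <- clamp_time(as.POSIXct("2020-07-24 23:45", tz = "UTC"), open_t, close_t)

# Business time left on that day, from the clamped start until closing
remaining <- as.numeric(difftime(close_t, inside, units = "hours"))
```

Here remaining covers the 40 minutes between 22:20 and 23:00, which is the Friday share of the 12.67 hours in the question's first row.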

library(lubridate)
library(dplyr)

calc_bus_hours <- function(start, end) {
  # One entry per calendar day from the start date to the end date
  my_dates <- seq.Date(as.Date(start), as.Date(end), by = "day")
  
  # Business-hours interval for each day: 09:00-21:00 on weekends, 08:00-23:00 otherwise
  my_intervals <- if_else(weekdays(my_dates) %in% c("Saturday", "Sunday"),
    interval(ymd_hm(paste(my_dates, "09:00"), tz = "UTC"), ymd_hm(paste(my_dates, "21:00"), tz = "UTC")),
    interval(ymd_hm(paste(my_dates, "08:00"), tz = "UTC"), ymd_hm(paste(my_dates, "23:00"), tz = "UTC")))

  # Clamp the start of the first day's interval to the actual start time
  int_start(my_intervals[1]) <- pmax(pmin(start, int_end(my_intervals[1])), int_start(my_intervals[1]))
  # Clamp the end of the last day's interval to the actual end time
  int_end(my_intervals[length(my_intervals)]) <- pmax(pmin(end, int_end(my_intervals[length(my_intervals)])), int_start(my_intervals[length(my_intervals)]))
  
  # Total business hours across all days
  sum(time_length(my_intervals, "hour"))
}

calc_bus_hours(as.POSIXct("07/24/2020 22:20", format = "%m/%d/%Y %H:%M", tz = "UTC"), as.POSIXct("07/25/2020 21:20", format = "%m/%d/%Y %H:%M", tz = "UTC"))
[1] 12.66667

Edit: weekdays() returns day names in the current locale, so in a Spanish locale use c("sábado", "domingo") instead of c("Saturday", "Sunday").
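
Since the day names depend on the locale, a numeric weekday check may be more robust. The following is a minimal sketch in base R; is_weekend is a hypothetical helper name, and it relies on the wday field of POSIXlt, which numbers days from 0 (Sunday) to 6 (Saturday) regardless of language settings.

```r
# is_weekend() is a hypothetical helper, not part of the original answer.
# as.POSIXlt()$wday is 0 for Sunday and 6 for Saturday in every locale.
is_weekend <- function(d) as.POSIXlt(d)$wday %in% c(0, 6)

# Friday, Saturday, Sunday, Monday
weekend_flags <- is_weekend(as.Date(c("2020-07-24", "2020-07-25",
                                      "2020-07-26", "2020-07-27")))
weekend_flags
```

Under that assumption, the condition weekdays(my_dates) %in% c("Saturday", "Sunday") in the function could be swapped for is_weekend(my_dates) without touching the rest.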

For the data frame example, you can use mapply to call the function with the two selected columns as arguments. This assumes both columns already hold date-times (POSIXct) rather than character strings. Try:

df$business_hours <- mapply(calc_bus_hours, df$start_date, df$end_date)

           start_date            end_date business_hours
1 2020-07-24 22:20:00 2020-07-25 21:20:00       12.66667
2 2020-07-14 21:00:00 2020-07-16 09:30:00       18.50000
3 2020-07-18 08:26:00 2020-07-19 10:00:00       13.00000
4 2020-07-10 08:00:00 2020-07-13 11:00:00       42.00000
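
For an end-to-end check, the question's table can be rebuilt and parsed before calling mapply. The following is a sketch that uses only base R so it runs without extra packages; bus_hours_base is a hypothetical base-R variant of the same day-by-day clamping idea, not the function above, and the df contents are copied from the question.

```r
# Hypothetical base-R variant of the day-by-day clamping idea.
# Assumes start and end are POSIXct values in UTC with start <= end.
bus_hours_base <- function(start, end) {
  days <- seq(as.Date(start, tz = "UTC"), as.Date(end, tz = "UTC"), by = "day")
  weekend <- as.POSIXlt(days)$wday %in% c(0, 6)  # 0 = Sunday, 6 = Saturday
  # Opening and closing time of every day as POSIXct
  midnight <- as.POSIXct(format(days), tz = "UTC")
  open_t  <- midnight + ifelse(weekend, 9, 8) * 3600
  close_t <- midnight + ifelse(weekend, 21, 23) * 3600
  # Restrict each day's window to [start, end]; empty windows count as 0
  from <- pmax(open_t, start)
  to   <- pmin(close_t, end)
  sum(pmax(as.numeric(difftime(to, from, units = "hours")), 0))
}

# Data from the question, still as character strings
df <- data.frame(
  start_date = c("07/24/2020 22:20", "07/14/2020 21:00",
                 "07/18/2020 08:26", "07/10/2020 08:00"),
  end_date   = c("07/25/2020 21:20", "07/16/2020 09:30",
                 "07/19/2020 10:00", "07/13/2020 11:00"),
  stringsAsFactors = FALSE
)

# Parse the month/day/year strings into POSIXct before doing any arithmetic
fmt <- "%m/%d/%Y %H:%M"
df$start_date <- as.POSIXct(df$start_date, format = fmt, tz = "UTC")
df$end_date   <- as.POSIXct(df$end_date, format = fmt, tz = "UTC")

df$business_hours <- mapply(bus_hours_base, df$start_date, df$end_date)
round(df$business_hours, 2)
```

The parsing step matters because the raw strings are in month/day/year order, which as.Date and the comparison functions cannot interpret on their own.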


Source: https://stackoverflow.com/questions/63139030/how-to-calculate-business-hours-between-two-dates-when-business-hours-vary-depen
