Fill missing date values in column by adding delivery interval to another date column

刺人心 2021-01-16 21:15

Data:

DB1 <- data.frame(orderItemID = 1:10,
                  orderDate = c("2013-01-21", "2013-03-31", "2013-04-12", "2013-06-01", "2014-01-01", "2014-02-19"


        
1 Answer
  • 2021-01-16 21:58

    You want to do date arithmetic: fill the NAs in the deliveryDate column by adding a period of two days to the orderDate column. lubridate supplies convenience functions for exactly that purpose: days(), weeks(), months(), years(), hours(), minutes() and seconds(). First, though, you have to parse your (European-format) date strings into R date objects, with a format string that matches how the strings are actually written.
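
    As a minimal sketch of those two steps (parsing, then adding a period), assuming lubridate is installed: the sample strings and the names iso_date and eu_date are made up for illustration and are not taken from the question.

    ```r
    library(lubridate)

    # Parse with a format that matches how the string is written.
    iso_date <- ymd("2013-04-12")                          # year-month-day
    eu_date  <- as.Date("12.04.13", format = "%d.%m.%y")   # day.month.two-digit year

    # Both strings describe the same calendar day.
    identical(iso_date, eu_date)

    # Periods add calendar units to a date.
    two_days_later <- iso_date + days(2)
    one_week_later <- iso_date + weeks(1)
    ```

    If the format string does not match the input, as.Date() returns NA rather than raising an error, so it is worth checking for NAs right after parsing.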

    Something like the following, using lubridate for date-arithmetic and dplyr for dataframe manipulation:

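    library(dplyr)
    library(lubridate)
    
    # The format must match your strings: "%d.%m.%y" parses day.month.year
    # (e.g. "21.01.13"); ISO strings such as "2013-01-21" need "%Y-%m-%d".
    DB1$orderDate    <- as.POSIXct(DB1$orderDate, format = "%d.%m.%y", tz = "UTC")
    DB1$deliveryDate <- as.POSIXct(DB1$deliveryDate, format = "%d.%m.%y", tz = "UTC")
    
    # Median delivery time in days, ignoring rows without a delivery date
    DB1 %>%
      summarize(median_delivery = median(difftime(deliveryDate, orderDate, units = "days"),
                                         na.rm = TRUE))
    
    # median_delivery
    #        1.5 days
    # so you round up to 2 days
    delivery_days <- 2
    
    # Fill only the missing delivery dates; all other rows stay unchanged
    missing <- is.na(DB1$deliveryDate)
    DB1 <- DB1 %>%
      mutate(deliveryDate = if_else(is.na(deliveryDate),
                                    orderDate + days(delivery_days),
                                    deliveryDate))
    
    # The rows that were filled in
    DB1[missing, ]
    
    # orderItemID  orderDate deliveryDate
    #           3 2013-04-12   2013-04-14
    #           6 2014-02-19   2014-02-21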
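
    Because the question's data is cut off, here is a self-contained sketch of the same fill-in idea that can be run as-is. It is an illustration under assumptions, not the original data: the orders data frame and its values are hypothetical, and it uses base R Date arithmetic in place of lubridate and dplyr.

    ```r
    # Hypothetical toy data (not the question's data): two orders lack a delivery date.
    orders <- data.frame(
      orderItemID  = 1:5,
      orderDate    = as.Date(c("2013-01-21", "2013-03-31", "2013-04-12",
                               "2013-06-01", "2014-02-19")),
      deliveryDate = as.Date(c("2013-01-23", "2013-04-01", NA,
                               "2013-06-03", NA))
    )

    # Delivery time in days for each row; NA where the delivery date is missing.
    delivery_time <- as.numeric(difftime(orders$deliveryDate, orders$orderDate,
                                         units = "days"))

    # Typical delivery time: the median over the known rows, rounded up to whole days.
    typical_days <- ceiling(median(delivery_time, na.rm = TRUE))

    # Fill only the missing delivery dates; adding a number to a Date adds that many days.
    missing <- is.na(orders$deliveryDate)
    orders$deliveryDate[missing] <- orders$orderDate[missing] + typical_days

    orders
    ```

    Indexing with the logical vector missing overwrites only the NA entries, so the rows that already had a delivery date are left untouched.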
    