Replace NAs in dates with another date

Submitted by 六月ゝ 毕业季﹏ on 2019-12-20 05:21:52

Question


Data:

DB1 <- data.frame(orderItemID  = 1:10,     
orderDate = c("2013-01-21","2013-03-31","2013-04-12","2013-06-01","2014-01-01", "2014-02-19","2014-02-27","2014-10-02","2014-10-31","2014-11-21"),  
deliveryDate = c("2013-01-23", "2013-03-01", "NA", "2013-06-04", "2014-01-03", "NA", "2014-02-28", "2014-10-04", "2014-11-01", "2014-11-23"))

Expected Outcome:

DB1 <- data.frame(orderItemID  = 1:10,     
orderDate = c("2013-01-21","2013-03-31","2013-04-12","2013-06-01","2014-01-01", "2014-02-19","2014-02-27","2014-10-02","2014-10-31","2014-11-21"),  
deliveryDate = c("2013-01-23", "2013-03-01", "2013-04-14", "2013-06-04", "2014-01-03", "2014-02-21", "2014-02-28", "2014-10-04", "2014-11-01", "2014-11-23"))

My question is similar to another one I posted, so don't be confused. As you can see above, some delivery dates are missing, and I want to replace them with another date: the order date of the specific item plus the average delivery time in full days (2 days). The average delivery time is calculated from all rows that do not contain missing values: (2 days + 1 day + 3 days + 2 days + 1 day + 2 days + 1 day + 2 days) / 8 = 1.75 days.

So I want to replace each NA in the delivery date with the order date + 2 days. When there's no NA, the date should stay the same.
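
The arithmetic above can be checked with a short base R sketch. The vector `delivery_days` is simply the eight per-row delivery times listed in the question, and the names `avg_days` and `full_days` are illustrative, not part of the original data:

```r
# Delivery times in days for the eight complete rows, as listed in the question.
delivery_days <- c(2, 1, 3, 2, 1, 2, 1, 2)

avg_days  <- mean(delivery_days)   # 14 / 8 = 1.75
full_days <- round(avg_days)       # rounded to full days: 2

# Applied to one of the rows with a missing delivery date (order date 2013-04-12).
filled <- as.Date("2013-04-12") + full_days
format(filled)                     # "2013-04-14"
```

Adding a plain number to a Date advances it by that many days, which is why the rounded value can be added directly.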

I tried this already (with lubridate), but it's not working:

DB1$deliveryDate[is.na(DB1$deliveryDate) ] <- DB1$orderDate + days(2)

Can someone please help me?


Answer 1:


Assuming that you have entered your data like this (note that NAs are not enclosed in quotes so they are read as NAs and not "NA")...

DB1 <- data.frame(orderItemID  = 1:10,     
  orderDate = c("2013-01-21","2013-03-31","2013-04-12","2013-06-01","2014-01-01", "2014-02-19","2014-02-27","2014-10-02","2014-10-31","2014-11-21"),  
  deliveryDate = c("2013-01-23", "2013-03-01", NA, "2013-06-04", "2014-01-03", NA, "2014-02-28", "2014-10-04", "2014-11-01", "2014-11-23"),
  stringsAsFactors = FALSE)

...and, per Nicola's answer, done this to get the formatting right...

DB1[,2:3]<-lapply(DB1[,2:3],as.Date)

...this also works:

library(lubridate)
DB1$deliveryDate <- with(DB1, as.Date(ifelse(is.na(deliveryDate), orderDate + days(2), deliveryDate), origin = "1970-01-01"))

Or you could use dplyr and pipe it:

library(lubridate)
library(dplyr)
DB2 <- DB1 %>%
  # ifelse() drops the Date class, so convert the numeric result back afterwards
  mutate(deliveryDate = ifelse(is.na(deliveryDate), orderDate + days(2), deliveryDate)) %>%
  mutate(deliveryDate = as.Date(deliveryDate, origin = "1970-01-01"))
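
Two details in this answer are easy to miss: a quoted "NA" is an ordinary string rather than a missing value, and ifelse() returns plain numbers when given Dates, which is why the result is passed back through as.Date() with an origin. A minimal base R sketch of both points follows; the small vectors are made up for illustration and are not the question's data:

```r
# A quoted "NA" is a string; only the unquoted NA is a missing value.
x <- c("2013-01-23", "NA", NA)
na_flags <- is.na(x)                 # FALSE FALSE TRUE

# Illustrative order and delivery dates (second delivery date is missing).
order_date    <- as.Date(c("2013-01-21", "2013-04-12"))
delivery_date <- as.Date(c("2013-01-23", NA))

# ifelse() strips the Date class and returns days since 1970-01-01.
raw <- ifelse(is.na(delivery_date), order_date + 2, delivery_date)
raw_class <- class(raw)              # "numeric"

# Converting back with the Unix epoch as origin restores the dates.
fixed <- as.Date(raw, origin = "1970-01-01")
format(fixed)                        # "2013-01-23" "2013-04-14"
```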



Answer 2:


First, convert the columns to Date objects:

DB1[,2:3]<-lapply(DB1[,2:3],as.Date)

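Then replace the NA elements with the order date plus the mean delivery time (delivery date minus order date), rounded to full days. Row 2 of the question's data has a delivery date (2013-03-01) that precedes its order date (2013-03-31); the question's own calculation counts that row as 1 day, so it is set to 2013-04-01 first:

DB1$deliveryDate[2] <- as.Date("2013-04-01")
DB1$deliveryDate[is.na(DB1$deliveryDate)] <- 
       DB1$orderDate[is.na(DB1$deliveryDate)] +
       round(mean(difftime(DB1$deliveryDate,DB1$orderDate,units="days"),na.rm=TRUE))
#   orderItemID  orderDate deliveryDate
#1            1 2013-01-21   2013-01-23
#2            2 2013-03-31   2013-04-01
#3            3 2013-04-12   2013-04-14
#4            4 2013-06-01   2013-06-04
#5            5 2014-01-01   2014-01-03
#6            6 2014-02-19   2014-02-21
#7            7 2014-02-27   2014-02-28
#8            8 2014-10-02   2014-10-04
#9            9 2014-10-31   2014-11-01
#10          10 2014-11-21   2014-11-23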
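
If the same replacement is needed in more than one place, the steps can be wrapped in a small function. This is only a sketch in base R: the name `fill_missing_delivery` and its arguments are hypothetical, and the four sample rows are a subset of the question's data chosen for illustration:

```r
# Hypothetical helper (not from the original answers): fill missing delivery
# dates with the order date plus the rounded mean delivery time.
fill_missing_delivery <- function(order_date, delivery_date) {
  order_date    <- as.Date(order_date)
  delivery_date <- as.Date(delivery_date)

  # Delivery time is delivery minus order, in days; NA rows are ignored.
  lag_days <- as.numeric(delivery_date - order_date, units = "days")
  avg_days <- round(mean(lag_days, na.rm = TRUE))

  # Index both sides with the same mask so the lengths always match.
  missing <- is.na(delivery_date)
  delivery_date[missing] <- order_date[missing] + avg_days
  delivery_date
}

orders     <- c("2013-01-21", "2013-04-12", "2013-06-01", "2014-01-01")
deliveries <- c("2013-01-23", NA, "2013-06-04", "2014-01-03")
filled <- fill_missing_delivery(orders, deliveries)
format(filled)
```

Here the complete rows take 2, 3 and 2 days, so the mean rounds to 2 and the missing entry becomes 2013-04-14; the other dates are left unchanged.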



Answer 3:


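You can do:

# Convert the two date columns to Date so they can be subtracted
DB1 = cbind(DB1$orderItemID, as.data.frame(lapply(DB1[-1], as.Date)))

# Mean delivery time over the complete rows, rounded to full days
days = round(mean(DB1$deliveryDate - DB1$orderDate, na.rm = TRUE))
mask = is.na(DB1$deliveryDate)

DB1$deliveryDate[mask] = DB1$orderDate[mask] + days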

#   DB1$orderItemID  orderDate deliveryDate
#1                1 2013-01-21   2013-01-23
#2                2 2013-03-31   2013-04-01
#3                3 2013-04-12   2013-04-14
#4                4 2013-06-01   2013-06-04
#5                5 2014-01-01   2014-01-03
#6                6 2014-02-19   2014-02-21
#7                7 2014-02-27   2014-02-28
#8                8 2014-10-02   2014-10-04
#9                9 2014-10-31   2014-11-01
#10              10 2014-11-21   2014-11-23

I cleaned up your data first, since it contained errors (the NAs are unquoted and the second delivery date is 2013-04-01):

DB1 <- data.frame(orderItemID  = 1:10,     
orderDate = c("2013-01-21","2013-03-31","2013-04-12","2013-06-01","2014-01-01", "2014-02-19","2014-02-27","2014-10-02","2014-10-31","2014-11-21"),  
deliveryDate = c("2013-01-23", "2013-04-01", NA, "2013-06-04", "2014-01-03", NA, "2014-02-28", "2014-10-04", "2014-11-01", "2014-11-23"))


Source: https://stackoverflow.com/questions/31490669/replace-na%c2%b4s-in-dates-with-another-date
