How can I assign the value from one of two POSIXct columns in a data.frame to a new POSIXct column?

Submitted by 倖福魔咒の on 2020-01-25 19:50:30

Question


I have a data.frame with two columns of type POSIXct, though for every row, only one column will have a value, e.g.,

library(lubridate)  # provides now()
dd <- data.frame(date1 = c(now(), NA), date2 = c(as.POSIXct(NA), now()))
> dd
                date1               date2
1 2016-05-06 11:30:04                <NA>
2                <NA> 2016-05-06 11:30:04

I would now like to create a third column that will contain the value of whichever column has a non-NA value, i.e., the result should look like

> dd
                date1               date2               date3
1 2016-05-06 11:26:36                <NA> 2016-05-06 11:26:36
2                <NA> 2016-05-06 11:26:36 2016-05-06 11:26:36 

I've tried using ifelse(), but it doesn't work:

> mutate(dd, date3 = ifelse(!is.na(date1), date1, date2))
                date1               date2      date3
1 2016-05-06 11:30:04                <NA> 1462559405
2                <NA> 2016-05-06 11:30:04 1462559405

Neither does logical vector-based assignment:

> dd[!is.na(dd$date1), "date3"] <- dd[!is.na(dd$date1), "date1"]
> dd[!is.na(dd$date2), "date3"] <- dd[!is.na(dd$date2), "date2"]
> dd
                date1               date2      date3
1 2016-05-06 11:30:04                <NA> 1462559405
2                <NA> 2016-05-06 11:30:04 1462559405

Can anyone explain this behavior?

Am I stuck with creating a new data.frame with an empty column of class POSIXct and then assigning into it? This would not be ideal because it breaks the rule of being able to just assign into a data.frame and having it magically work.

Or should I do the assignment and then change the column class afterwards (as suggested in this solution)? This would not be ideal because the conversion to numeric in the course of the assignment drops the timezone, which I would then have to supply again when calling as.POSIXct().

Thanks in advance!


Answer 1:


An alternative: treat date1 as the default, then fill in from date2 wherever date1 is NA.

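dd <- data.frame(date1 = c(now(), NA), date2 = c(as.POSIXct(NA), now()))
dd2 <- dd$date1                            # start from date1; the vector keeps class POSIXct
dd2[is.na(dd2)] <- dd$date2[is.na(dd2)]    # fill the gaps from date2
dd$date3 <- dd2                            # attach the result as the new column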
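As a reproducible sketch of the same idea, the version below replaces now() with fixed timestamps and an explicit "UTC" time zone (both are assumptions made for illustration, not part of the original answer) and checks that the class and time zone survive the assignment:

```r
# Fixed, made-up timestamps so the result does not depend on the current time.
dd <- data.frame(
  date1 = as.POSIXct(c("2016-05-06 11:30:04", NA), tz = "UTC"),
  date2 = as.POSIXct(c(NA, "2016-05-06 12:15:30"), tz = "UTC")
)

# Start from date1, which is already POSIXct, then fill its NAs from date2.
date3 <- dd$date1
missing1 <- is.na(date3)
date3[missing1] <- dd$date2[missing1]
dd$date3 <- date3

class(dd$date3)          # still a date-time class
attr(dd$date3, "tzone")  # time zone attribute carried over from date1
```

Because the assignment goes into an existing POSIXct vector rather than through ifelse(), nothing is converted to a bare number along the way.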



Answer 2:


The following solution worked for me, although it's not very clean code:

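# Read the data; empty or blank cells become NA
dd <- read.csv("dd.csv", stringsAsFactors = FALSE, na.strings = c("", " "))

# Parse both columns as POSIXct (format passed by name to avoid positional ambiguity)
dd[, 1] <- as.POSIXct(dd[, 1], format = "%m/%d/%Y %H:%M", tz = "GMT")
dd[, 2] <- as.POSIXct(dd[, 2], format = "%m/%d/%Y %H:%M", tz = "GMT")

# Create Date3 as a copy of Date1 so the new column is already POSIXct
dd[, 'Date3'] <- dd[, 1]

# Copy in whichever column has a value
dd[which(!is.na(dd$Date1)), 'Date3'] <- dd$Date1[!is.na(dd$Date1)]
dd[which(!is.na(dd$Date2)), 'Date3'] <- dd$Date2[!is.na(dd$Date2)]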

str(dd)
'data.frame':   6 obs. of  3 variables:
 $ Date1: POSIXct, format: "2016-05-20 11:30:00" ...
 $ Date2: POSIXct, format: NA ...
 $ Date3: POSIXct, format: "2016-05-20 11:30:00" ...

sum(is.na(dd$Date3))
[1] 0

The trick I used was to create Date3 as a copy of Date1, which means the new column already has class POSIXct before any values are assigned into it.
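Since dd.csv is not available, here is a self-contained sketch of the same trick. The in-memory data frame is a hypothetical stand-in for the CSV file, with made-up values in the "%m/%d/%Y %H:%M" format used above:

```r
# Hypothetical stand-in for dd.csv (values are invented for illustration).
dd <- data.frame(
  Date1 = c("05/20/2016 11:30", NA, "05/21/2016 09:15"),
  Date2 = c(NA, "05/20/2016 14:45", NA),
  stringsAsFactors = FALSE
)
dd$Date1 <- as.POSIXct(dd$Date1, format = "%m/%d/%Y %H:%M", tz = "GMT")
dd$Date2 <- as.POSIXct(dd$Date2, format = "%m/%d/%Y %H:%M", tz = "GMT")

# Seed Date3 from Date1 so the column starts out as POSIXct.
dd$Date3 <- dd$Date1

# Fill only the rows where Date3 is still NA and Date2 has a value.
use2 <- is.na(dd$Date3) & !is.na(dd$Date2)
dd$Date3[use2] <- dd$Date2[use2]

str(dd)
sum(is.na(dd$Date3))
```

Seeding from Date1 already copies every non-NA Date1 value, so only the Date2 rows need a second assignment.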



Source: https://stackoverflow.com/questions/37079480/how-can-i-assign-the-value-from-one-of-two-posixct-columns-in-a-data-frame-to-a
