Dealing with numeric (decimal) dates in R?

Submitted by 安稳与你 on 2019-12-02 16:04:28

Question


I have some numeric date data from Excel:

> df <- data.frame(c(42613, 42612, 42614), c(42614.61, 42613.97, 42612.12))
> names(df) <- c("Closetime", "Opentime")

Both Closetime and Opentime are numeric. I want to preserve the hour/minute/second data from Opentime and add the time '00:00:00' to every date in Closetime:

> df$Closetime <- paste(as.Date(df$Closetime, origin = '1900-01-01'), c('00:00:00'))

Easy enough to do Closetime, but when I try using lubridate::date_decimal on Opentime, everything goes to hell.

> df$Opentime <- date_decimal(df$Opentime)
> df
            Closetime             Opentime
1 2016-09-02 00:00:00 42614-08-11 15:36:00
2 2016-09-01 00:00:00 42613-12-21 01:12:00
3 2016-09-03 00:00:00 42612-02-13 22:04:48

How can I get both Opentime and Closetime to be of the same type/format? For reference, I ultimately want the difference, in hours, between the times in the two columns.


Answer 1:


If you can use POSIXct, you can do, for example:

df$Opentime <- as.POSIXct( df$Opentime*24*60*60, 
                origin="1900-01-01", 
                tz="UTC")

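Reasoning: POSIXct is just the number of seconds since the origin, so multiplying the day count by 24*60*60 converts days to seconds. Note that help("as.Date") recommends origin = "1899-12-30" for dates from Windows Excel; with "1900-01-01" the results come out two days late.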
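As a minimal sketch of that idea, the hypothetical helper below (excel_to_posixct is not from the answer) multiplies the serial day count by 86400 and rounds to whole seconds, so that floating-point noise does not shift the displayed second. It assumes the origin "1899-12-30" that help("as.Date") suggests for Windows Excel; adjust it if your data come from a different system.

```r
# Hypothetical helper: convert Excel serial day numbers to POSIXct.
# Assumption: Windows Excel origin "1899-12-30" (see help("as.Date")).
excel_to_posixct <- function(serial, origin = "1899-12-30", tz = "UTC") {
  # POSIXct counts seconds, so turn days into whole seconds first
  as.POSIXct(round(serial * 24 * 60 * 60), origin = origin, tz = tz)
}

opentime <- excel_to_posixct(c(42614.61, 42613.97, 42612.12))
format(opentime, "%Y-%m-%d %H:%M:%S")
# "2016-09-01 14:38:24" "2016-08-31 23:16:48" "2016-08-30 02:52:48"
```

Rounding to whole seconds is a design choice for display; drop round() if sub-second precision matters.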




Answer 2:


First we follow the advice in help("as.Date") regarding Excel dates. I assume here Windows Excel:

df$Closetime_p <- as.Date(df$Closetime, origin = "1899-12-30")
df$Opentime_p <- as.Date(floor(df$Opentime), origin = "1899-12-30")

Then we convert to POSIXct:

df$Closetime_p <- as.POSIXct(as.POSIXlt(df$Closetime_p, tz = "GMT"), tz = "GMT")
df$Opentime_p <- as.POSIXct(as.POSIXlt(df$Opentime_p, tz = "GMT"), tz = "GMT")

Now we add the time:

df$Opentime_p <- df$Opentime_p + (df$Opentime - floor(df$Opentime)) * 24 * 3600
#  Closetime Opentime Closetime_p          Opentime_p
#1     42613 42614.61  2016-08-31 2016-09-01 14:38:24
#2     42612 42613.97  2016-08-30 2016-08-31 23:16:48
#3     42614 42612.12  2016-09-01 2016-08-30 02:52:48
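The arithmetic behind the added time can be checked by hand. The sketch below, which uses illustrative variable names that are not part of the answer, splits each serial number into whole days and a fractional day, then turns the fraction into hours, minutes and seconds. Rounding to whole seconds is an assumption made here to keep floating-point noise out of the result.

```r
serial <- c(42614.61, 42613.97, 42612.12)

days <- floor(serial)                   # whole days since the Excel origin
secs <- round((serial - days) * 86400)  # fraction of a day, in whole seconds

# Break the seconds into hh:mm:ss
time_of_day <- sprintf("%02d:%02d:%02d",
                       secs %/% 3600,
                       (secs %% 3600) %/% 60,
                       secs %% 60)
time_of_day
# "14:38:24" "23:16:48" "02:52:48"
```

These match the time parts of Opentime_p in the output above.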



Answer 3:


Convert to date as you do, then convert to POSIXct:

First, create the data.frame (and note how we set the column names):

R> df <- data.frame(CloseT=c(42613, 42612, 42614), OpenT=c(42614.61, 42613.97, 42612.12))
R> df
  CloseT   OpenT
1  42613 42614.6
2  42612 42614.0
3  42614 42612.1
R> 

Then convert to Date:

R> df$CloseT <- as.Date(df$CloseT, origin="1900-01-01")
R> df$OpenT <- as.Date(df$OpenT, origin="1900-01-01")
R> df
      CloseT      OpenT
1 2016-09-02 2016-09-03
2 2016-09-01 2016-09-02
3 2016-09-03 2016-09-01
R>

Finally, convert to POSIXct:

R> df$OpenT <- as.POSIXct(df$OpenT)
R> df$CloseT <- as.POSIXct(df$CloseT)
R> df
               CloseT               OpenT
1 2016-09-01 19:00:00 2016-09-03 09:38:24
2 2016-08-31 19:00:00 2016-09-02 18:16:48
3 2016-09-02 19:00:00 2016-08-31 21:52:48
R> 

Going via POSIXlt allows you to set a timezone, as Roland showed (see the answer above that uses as.POSIXlt with tz = "GMT").
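The 19:00:00 in the CloseT column above appears to come from printing midnight UTC in the session's local time zone; how as.POSIXct() on a Date is displayed by default can depend on the R version. As a hedged sketch, one way to sidestep this is to name the time zone explicitly when formatting. The origin "1899-12-30" is an assumption taken from help("as.Date") for Windows Excel, not from this answer.

```r
closetime <- as.POSIXct(as.Date(c(42613, 42612, 42614), origin = "1899-12-30"))

# Render explicitly in UTC, independent of the session's local time zone
closetime_utc <- format(closetime, "%Y-%m-%d %H:%M:%S", tz = "UTC")
closetime_utc
# "2016-08-31 00:00:00" "2016-08-30 00:00:00" "2016-09-01 00:00:00"
```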




Answer 4:


Check the documentation on date_decimal:

a POSIXct object, whose year corresponds to the integer part of decimal.

library(lubridate)
date <- ymd("2009-02-10")
decimal <- decimal_date(date)  # 2009.11
date_decimal(decimal) # "2009-02-10 UTC"

So in your example, it's interpreting 42614 as the year.
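To make the difference concrete, here is a rough base-R sketch of what a "decimal year" means: the integer part is the year and the fraction is a share of that year's length. The function decimal_year_to_posixct is a hypothetical illustration written for this explanation, not lubridate's actual implementation, and it only handles four-digit years.

```r
# Hypothetical illustration of a decimal-year conversion (not lubridate code)
decimal_year_to_posixct <- function(decimal, tz = "UTC") {
  year  <- floor(decimal)
  start <- as.POSIXct(paste0(year, "-01-01"), tz = tz)      # start of that year
  end   <- as.POSIXct(paste0(year + 1, "-01-01"), tz = tz)  # start of next year
  year_secs <- as.numeric(difftime(end, start, units = "secs"))
  start + (decimal - year) * year_secs
}

mid_2016 <- decimal_year_to_posixct(2016.5)
format(mid_2016, "%Y-%m-%d %H:%M:%S")
# "2016-07-02 00:00:00"
```

An Excel serial number such as 42614.61 counts days, not years, which is why feeding it to a decimal-year function lands in the year 42614.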

Try using as.POSIXct. You may have to specify the time zone, but if all you need is the delta this won't be necessary. Below I've calculated the time difference:

df <- data.frame(c(42613, 42612, 42614), c(42614.61, 42613.97, 42612.12))
names(df) <- c("Closetime", "Opentime")
df$Closetime <- as.POSIXct(as.Date(df$Closetime, origin = '1900-01-01'))
df$Opentime <- as.POSIXct(as.Date(df$Opentime, origin = '1900-01-01'))
df$delta <- df$Opentime - df$Closetime
df
            Closetime            Opentime      delta
1 2016-09-01 20:00:00 2016-09-03 10:38:24  1.61 days
2 2016-08-31 20:00:00 2016-09-02 19:16:48  1.97 days
3 2016-09-02 20:00:00 2016-08-31 22:52:48 -1.88 days

Based on the comment, if you want to make sure the display has the correct hour, you'll need to match timezones correctly. You can do this after the conversion with as.POSIXct by setting the tzone attribute:

df <- data.frame(c(42613, 42612, 42614), c(42614.61, 42613.97, 42612.12))
names(df) <- c("Closetime", "Opentime")
df$Closetime <- as.POSIXct(as.Date(df$Closetime, origin = '1900-01-01'))
df$Opentime <- as.POSIXct(as.Date(df$Opentime, origin = '1900-01-01'))
attr(df$Closetime, "tzone") <- "GMT"
attr(df$Opentime, "tzone") <- "GMT"
df$delta <- df$Opentime - df$Closetime
df

   Closetime            Opentime      delta
1 2016-09-02 2016-09-03 14:38:24  1.61 days
2 2016-09-01 2016-09-02 23:16:48  1.97 days
3 2016-09-03 2016-09-01 02:52:48 -1.88 days
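Since the question asks for the difference in hours rather than days, a small sketch with difftime() and an explicit units argument may help. The variable names are illustrative, and the origin "1899-12-30" is the Windows Excel assumption from help("as.Date"); the origin cancels out in the difference, so it does not affect the result.

```r
closetime <- as.POSIXct(c(42613, 42612, 42614) * 86400,
                        origin = "1899-12-30", tz = "UTC")
opentime  <- as.POSIXct(round(c(42614.61, 42613.97, 42612.12) * 86400),
                        origin = "1899-12-30", tz = "UTC")

# Ask for hours explicitly instead of letting R pick the unit
delta_hours <- as.numeric(difftime(opentime, closetime, units = "hours"))
round(delta_hours, 2)
# 38.64  47.28 -45.12
```

Wrapping the result in as.numeric() drops the difftime class, which is convenient if the hours feed into further arithmetic.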


Source: https://stackoverflow.com/questions/39798770/dealing-with-numeric-decimal-dates-in-r
