Lubridate as_date and as_datetime differences in behavior

Posted by 僤鯓⒐⒋嵵緔 on 2021-02-10 06:41:33

Question


I have a numeric vector representing the number of milliseconds since January 1, 1970. I would like to convert these values into date-time objects using lubridate. A sample of the data is below:

raw_times <- c(1139689917479, 1139667123031, 1140364113915, 1140364951003, 
               1139643685434, 1139677091970, 1139691963511, 1140339448413, 1140368308429, 
               1139686613641, 1139666081813, 1140351488730, 1140346617958, 1141933663183, 
               1141933207579, 1140360125149, 1140351845108, 1140365079103, 1141933549825, 
               1140365601476)

Knowing that the documentation for as_date and as_datetime indicates they take a numeric vector representing the number of days since January 1, 1970, I tried the following:

library(lubridate)

as_date(raw_times / (1000 * 60 * 60 * 24))
"2006-02-11" "2006-02-11" "2006-02-19" "2006-02-19" "2006-02-11" 
"2006-02-11" "2006-02-11" "2006-02-19" "2006-02-19" "2006-02-11" 
"2006-02-11" "2006-02-19" "2006-02-19" "2006-03-09" "2006-03-09"
"2006-02-19" "2006-02-19" "2006-02-19" "2006-03-09" "2006-02-19"

(This uses the fact that there are 1000 ms in a second, 60 seconds in a minute, 60 minutes in an hour, and 24 hours in a day.)

When I run the same code with as_datetime, I get the following:

as_datetime(raw_times / (1000 * 60 * 60 * 24))
"1970-01-01 03:39:50 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:50 UTC"
"1970-01-01 03:39:50 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:58 UTC"
"1970-01-01 03:39:58 UTC" "1970-01-01 03:40:16 UTC" "1970-01-01 03:40:16 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC"
"1970-01-01 03:40:16 UTC" "1970-01-01 03:39:58 UTC"

The results are different. I assume there is some other argument that I am missing, but I can't find anything in the documentation that tells me what it would be.

Session information below:

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lubridate_1.6.0

loaded via a namespace (and not attached):
[1] magrittr_1.5  tools_3.3.2   stringi_1.1.2 stringr_1.1.0

Answer 1:


Not a lubridate solution, but you can do this with base::.POSIXct, after dividing the milliseconds by 1000 to get seconds:

R> options(digits.secs=3)
R> .POSIXct(raw_times/1000)
 [1] "2006-02-11 14:31:57.479 CST" "2006-02-11 08:12:03.030 CST"
 [3] "2006-02-19 09:48:33.914 CST" "2006-02-19 10:02:31.003 CST"
 [5] "2006-02-11 01:41:25.434 CST" "2006-02-11 10:58:11.970 CST"
 [7] "2006-02-11 15:06:03.510 CST" "2006-02-19 02:57:28.413 CST"
 [9] "2006-02-19 10:58:28.428 CST" "2006-02-11 13:36:53.641 CST"
[11] "2006-02-11 07:54:41.812 CST" "2006-02-19 06:18:08.730 CST"
[13] "2006-02-19 04:56:57.957 CST" "2006-03-09 13:47:43.183 CST"
[15] "2006-03-09 13:40:07.578 CST" "2006-02-19 08:42:05.148 CST"
[17] "2006-02-19 06:24:05.108 CST" "2006-02-19 10:04:39.102 CST"
[19] "2006-03-09 13:45:49.825 CST" "2006-02-19 10:13:21.476 CST"
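
For readers who want to stay within lubridate, the sketch below illustrates the likely cause of the discrepancy in the question. It assumes that as_date() reads a numeric as days since 1970-01-01 while as_datetime() reads it as seconds since 1970-01-01 UTC, which matches the outputs shown in the question. The variable names (days, wrong, dt) are illustrative only, and just two of the original timestamps are used.

```r
library(lubridate)

# Two of the millisecond timestamps from the question
raw_times <- c(1139689917479, 1139667123031)

# What the question did: milliseconds -> days
days <- raw_times / (1000 * 60 * 60 * 24)

# as_date() counts days, so this gives the intended calendar dates
d <- as_date(days)

# as_datetime() counts seconds, so the same numbers (about 13190)
# land only a few hours after the epoch
wrong <- as_datetime(days)

# Scaling to seconds instead gives the intended date-times (in UTC)
dt <- as_datetime(raw_times / 1000)

print(format(d))
print(format(wrong, "%Y-%m-%d %H:%M:%S"))
print(format(dt, "%Y-%m-%d %H:%M:%S"))
```

The date-times print in UTC here, so they are six hours ahead of the CST values shown above.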



Answer 2:


Another solution is to use the relatively recent anytime package, whose mission is to convert anything to proper Date or POSIXct objects with minimal fuss or input.

anytime() also takes the (properly scaled) seconds since epoch:

R> raw_times <- c(1139689917479, 1139667123031, 1140364113915,
+                1140364951003, 1139643685434, 1139677091970,
+                1139691963511, 1140339448413, 1140368308429,
+                1139686613641, 1139666081813, 1140351488730,
+                1140346617958, 1141933663183, 1141933207579,
+                1140360125149, 1140351845108, 1140365079103,
+                1141933549825, 1140365601476)
R> scaled_times <- raw_times / 1000
R> library(anytime)
R> options(digits.secs=6)   # subsecond display
R> anytime(scaled_times)           
 [1] "2006-02-11 14:31:57.479 CST"
 [2] "2006-02-11 08:12:03.030 CST"
 [3] "2006-02-19 09:48:33.914 CST"
 [4] "2006-02-19 10:02:31.003 CST"
 [5] "2006-02-11 01:41:25.434 CST"
 [6] "2006-02-11 10:58:11.970 CST"
 [7] "2006-02-11 15:06:03.510 CST"
 [8] "2006-02-19 02:57:28.413 CST"
 [9] "2006-02-19 10:58:28.428 CST"
[10] "2006-02-11 13:36:53.641 CST"
[11] "2006-02-11 07:54:41.812 CST"
[12] "2006-02-19 06:18:08.730 CST"
[13] "2006-02-19 04:56:57.957 CST"
[14] "2006-03-09 13:47:43.183 CST"
[15] "2006-03-09 13:40:07.578 CST"
[16] "2006-02-19 08:42:05.148 CST"
[17] "2006-02-19 06:24:05.108 CST"
[18] "2006-02-19 10:04:39.102 CST"
[19] "2006-03-09 13:45:49.825 CST"
[20] "2006-02-19 10:13:21.476 CST"
R> 

Using anytime() is slight overkill (as Josh showed), but then again it may be preferable to use an exported function rather than a hidden base function. And anytime() wins over the official as.POSIXct() by not requiring the origin (again and again).
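
For comparison, here is a minimal sketch of the as.POSIXct() route mentioned above, with the origin spelled out. The choice of tz = "UTC" is an assumption made for reproducibility; the outputs earlier in this thread were printed in the answerers' local zone (CST).

```r
# Two of the millisecond timestamps from the question
raw_times <- c(1139689917479, 1139667123031)

# Scale milliseconds to seconds, then convert with an explicit origin.
# tz = "UTC" is an assumption here; omit it to use the local time zone.
pt <- as.POSIXct(raw_times / 1000, origin = "1970-01-01", tz = "UTC")

# Whole seconds only; add options(digits.secs = 3) to display milliseconds
print(format(pt, "%Y-%m-%d %H:%M:%S"))
```

As far as I know, recent R releases (4.3.0 and later) also accept a numeric without origin, but the R 3.3.2 session shown in the question does not.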



Source: https://stackoverflow.com/questions/40959726/lubridate-as-date-and-as-datetime-differences-in-behavior
