My actual data looks like:
8/8/2013 15:10
7/26/2013 10:30
7/11/2013 14:20
3/28/2013 16:15
3/18/2013 15:50
When I read this from the excel f
Maybe it is a matter of how R reads the data. Parsing the string directly with lubridate
works well:
x <- "8/8/2013 15:10"
library(lubridate)
mdy_hm(x, tz = "GMT")  # the data are in month/day/year order
[1] "2013-08-08 15:10:00 GMT"
The problem is that either R or Excel is rounding the number to two decimals. When you convert, for example, the cell containing 8/8/2013 15:10
to text formatting (in Excel on Mac OS X), you get the number 41494.63194.
When you use:
as.POSIXct(41494.63194*86400, origin="1899-12-30",tz="GMT")
it will give you:
[1] "2013-08-08 15:09:59 GMT"
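This is 1 second off from the original date, which is also an indication that 41494.63194
is rounded to five decimals: the exact serial for 15:10 is 41494.631944..., so the shortened fraction falls about 0.4 seconds short and prints as 15:09:59.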
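If you do have to start from the serial number, one possible workaround is to round the product to the nearest whole second before converting. This is only a sketch: it assumes the times in the sheet were entered to whole seconds, and the variable names are made up for illustration.

```r
# Excel serial number as displayed with five decimals (from the example above)
serial <- 41494.63194

# Days -> seconds, rounded to the nearest second to absorb the rounding error
secs <- round(serial * 86400)

fixed <- as.POSIXct(secs, origin = "1899-12-30", tz = "GMT")
format(fixed, "%Y-%m-%d %H:%M:%S", tz = "GMT")
# "2013-08-08 15:10:00"
```

Rounding would hide genuine sub-second values, so it only makes sense when you know the source data has no fractional seconds.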
Probably the best solution is to export your Excel file to a .csv
or a tab-separated .txt
file and then read it into R. This at least gives me the correct dates:
> df
datum
1 8/8/2013 15:10
2 7/26/2013 10:30
3 7/11/2013 14:20
4 3/28/2013 16:15
5 3/18/2013 15:50
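A rough sketch of that route, assuming the export has a single column named datum as in the printout above. The text = argument is only a stand-in so the example runs on its own; with a real export you would pass the file path to read.csv() instead.

```r
# Stand-in for the exported .csv file (contents copied from the question)
csv_text <- "datum
8/8/2013 15:10
7/26/2013 10:30
7/11/2013 14:20
3/28/2013 16:15
3/18/2013 15:50"

# Keep the column as character so it can be parsed explicitly
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Month/day/year order with a 24-hour clock
df$datum <- as.POSIXct(df$datum, format = "%m/%d/%Y %H:%M", tz = "GMT")
str(df)
```

Because the dates arrive as text, no serial numbers are involved and nothing gets rounded along the way.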
This is how it works over here on a Windows system. This is what a source Excel 2010 file looks like:
date               num          secs          constant     Rtime
(mm/dd/yyyy)       (in Excel)   (num*86400)   (Windows)    (secs-constant)
08/08/2013 15:10   41494.63     3585136200    2209161600   1375974600
07/26/2013 10:30   41481.44     3583996200    2209161600   1374834600
11/07/2013 14:20   41585.60     3592995600    2209161600   1383834000
03/28/2013 16:15   41361.68     3573648900    2209161600   1364487300
03/18/2013 15:50   41351.66     3572783400    2209161600   1363621800
Rtime <- c(1375974600,1374834600,1383834000,1364487300,1363621800)
as.POSIXct(Rtime,origin="1970-01-01",tz="GMT")
#[1] "2013-08-08 15:10:00 GMT" "2013-07-26 10:30:00 GMT"
#[3] "2013-11-07 14:20:00 GMT" "2013-03-28 16:15:00 GMT"
#[5] "2013-03-18 15:50:00 GMT"
Why this constant? Because Excel, and Office generally, are a mess when dealing with dates. Seriously, look over here: Why is 1899-12-30 the zero date in Access / SQL Server instead of 12/31?
2209161600 is the number of seconds between 1899-12-30, which is the 0 point in Excel on Windows, and 1970-01-01, which is the POSIXct origin.
dput(as.POSIXct(2209161600,origin="1899-12-30",tz="GMT"))
#structure(0, tzone = "GMT", class = c("POSIXct", "POSIXt"))
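The constant does not have to be memorised; it can be derived from the two origins. A small sketch using the secs column from the table above (the variable names other than Rtime are mine, not from the spreadsheet):

```r
# The two origins, both in GMT so no daylight-saving shifts creep in
excel_origin <- as.POSIXct("1899-12-30", tz = "GMT")
unix_origin  <- as.POSIXct("1970-01-01", tz = "GMT")

# Seconds between the origins: 25569 days * 86400
constant <- as.numeric(difftime(unix_origin, excel_origin, units = "secs"))

# secs column from the table (Excel serial * 86400)
secs  <- c(3585136200, 3583996200, 3592995600, 3573648900, 3572783400)
Rtime <- secs - constant

as.POSIXct(Rtime, origin = "1970-01-01", tz = "GMT")
```

Passing origin = "1899-12-30" to as.POSIXct() does the same subtraction internally, so the explicit constant is mainly useful for seeing what is going on.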
Given
x <- c("8/8/2013 15:10","7/26/2013 10:30","7/11/2013 14:20","3/28/2013 16:15","3/18/2013 15:50")
(which is read as a character vector),
try
x <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "GMT")
This parses correctly into a POSIXct vector for me.
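One thing worth checking with this approach is the field order in the format string, since a value like 7/11/2013 is valid both month-first and day-first. A quick sketch of the difference, with illustrative variable names:

```r
s <- "7/11/2013 14:20"

# Same string, two interpretations
month_first <- as.POSIXct(s, format = "%m/%d/%Y %H:%M", tz = "GMT")
day_first   <- as.POSIXct(s, format = "%d/%m/%Y %H:%M", tz = "GMT")

format(month_first, "%Y-%m-%d", tz = "GMT")  # "2013-07-11"
format(day_first,   "%Y-%m-%d", tz = "GMT")  # "2013-11-07"

# A date that is only valid month-first becomes NA when parsed day-first
bad <- as.POSIXct("7/26/2013 10:30", format = "%d/%m/%Y %H:%M", tz = "GMT")
is.na(bad)  # TRUE
```

Running anyNA() on the parsed vector is a cheap way to catch a wrong format, although it will not flag dates that happen to be valid in both orders.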