I have date-time pairs in a CSV file that look like 11/4/2012 in one column and 12:06:08 AM in the neighboring column. They are recorded in local time (i.e., they switch between PDT and PST at the appropriate times), but there is no time zone or DST indicator in the file. The only visible sign of a switch is that the sequence of times jumps backwards. For example, on November 4, 2012, I have a sequence of times like
12:51:20 AM 1:13:08 AM 1:24:58 AM 1:40:28 AM 1:48:08 AM 1:54:08 AM 1:56:58 AM 1:04:28 AM 1:05:48 AM 1:07:18 AM 1:15:00 AM 1:39:08 AM 2:05:38 AM
PST presumably begins with the 1:04:28 AM reading, but there is no indicator.
Is there a straightforward approach to assigning time zones properly (presumably using lubridate)? The file is long, so I'd rather not loop through one reading at a time, as I fear that could take some time. I'll have to do the same thing in reverse for the spring.
This isn't possible. There's no way to know with certainty that "11/4/2012 1:04:28 AM" is PST and not actually a PDT observation between "11/4/2012 12:51:20 AM" and "11/4/2012 1:13:08 AM".
If you're certain the observations are ordered in the file, you could convert them to POSIXt and take the diff of the vector. Any negative values will be DST changes (the clock being set back). You may miss some, however, if the time between observations across a DST change is greater than 1 hour, because the difference then stays positive.
Lines <- "11/4/2012 12:51:20 AM
11/4/2012 01:13:08 AM
11/4/2012 01:24:58 AM
11/4/2012 01:40:28 AM
11/4/2012 01:48:08 AM
11/4/2012 01:54:08 AM
11/4/2012 01:56:58 AM
11/4/2012 01:04:28 AM
11/4/2012 01:05:48 AM
11/4/2012 01:07:18 AM
11/4/2012 01:15:00 AM
11/4/2012 01:39:08 AM
11/4/2012 02:05:38 AM"
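# Split the block of text into one timestamp string per line
x <- scan(con <- textConnection(Lines), what="", sep="\n")
close(con)
# Parse as wall-clock times and take successive differences;
# a negative value marks the point where the clock was set back
diff(strptime(x, format="%m/%d/%Y %I:%M:%S %p"))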
# Time differences in mins
# [1] 21.800000 11.833333 15.500000 7.666667 6.000000 2.833333
# [7] -52.500000 1.333333 1.500000 7.700000 24.133333 86.500000
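To go from detecting the change to actually assigning offsets without a loop, one option is to flag everything after the first negative difference and shift those readings by the standard-time offset. The following is only a minimal sketch under several assumptions that are not in the original answer: the readings are in chronological order, the stretch starts in PDT (UTC-7) and contains exactly one fall-back to PST (UTC-8), and no gap across the change exceeds one hour. The names `stamps`, `wall`, `fell_back` and `utc` are made up for illustration.

```r
# Wall-clock readings from the question, in file order
stamps <- c("11/4/2012 12:51:20 AM", "11/4/2012 01:13:08 AM",
            "11/4/2012 01:24:58 AM", "11/4/2012 01:40:28 AM",
            "11/4/2012 01:48:08 AM", "11/4/2012 01:54:08 AM",
            "11/4/2012 01:56:58 AM", "11/4/2012 01:04:28 AM",
            "11/4/2012 01:05:48 AM", "11/4/2012 01:07:18 AM",
            "11/4/2012 01:15:00 AM", "11/4/2012 01:39:08 AM",
            "11/4/2012 02:05:38 AM")

# Parse as if the readings were UTC, so parsing itself never has to
# guess how to resolve the repeated 1 AM hour
wall <- as.POSIXct(stamps, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# A negative step means the clock was set back; cumsum() carries that
# flag forward to every later reading (vectorized, no explicit loop)
fell_back <- cumsum(c(FALSE, diff(as.numeric(wall)) < 0)) > 0

# Assumption: PDT (UTC-7) before the step, PST (UTC-8) after it
offset_hours <- ifelse(fell_back, 8, 7)

# Shift the wall-clock values to true UTC instants
utc <- wall + offset_hours * 3600

# Display in local time with the zone abbreviation attached
format(utc, tz = "America/Los_Angeles", usetz = TRUE)
```

With the offsets applied, the instants in `utc` are strictly increasing again. The spring change should not need this treatment, since no local time is repeated then and parsing directly with `tz = "America/Los_Angeles"` is unambiguous. A file covering several years would need the flag reset for each autumn, which this sketch does not attempt.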
Source: https://stackoverflow.com/questions/15014757/is-there-a-way-to-assign-dst-transitions-automatically-in-lubridate