I have a column of time stamps in character format that looks like this:
2015-09-24 06:00:00 UTC
2015-09-24 05:00:00 UTC
dateTimeZone <- c("2015-09-24 06:00:00 UTC", "2015-09-24 05:00:00 UTC")
A data.table solution:
library(data.table)
data <- data.table(dateTimeZone=c("2015-09-24 06:00:00 UTC",
                                  "2015-09-24 05:00:00 America/Los_Angeles"))
data[, timezone:=tstrsplit(dateTimeZone, split=" ")[[3]]]
data[, datetime.local:=as.POSIXct(dateTimeZone, tz=timezone), by=timezone]
data[, datetime.utc:=format(datetime.local, tz="UTC")]
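The key thing is to split the data on the timezone field so that each timezone's rows are fed to as.POSIXct separately. as.POSIXct accepts only a single tz value, because a POSIXct vector carries one tzone attribute for the whole vector. For the same reason, the combined datetime.local column stores the correct instants but can only display in one timezone, which is why the last step formats it as UTC. Here I make use of data.table's efficient split-apply-combine syntax, but you could apply the same general idea with base R or dplyr.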
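The same split-by-timezone idea can be sketched in base R. This is only an illustration under a few assumptions: the timezone is always the last space-separated token, and `parse_group` is a hypothetical helper name, not something from the answer above.

```r
# Sketch: parse each timezone group separately, then combine as UTC instants.
dateTimeZone <- c("2015-09-24 06:00:00 UTC",
                  "2015-09-24 05:00:00 America/Los_Angeles")

# Assumption: the timezone is the last space-separated token of each string.
tz <- sub("^.* ", "", dateTimeZone)
# Strip the trailing timezone token so only the date-time part remains.
stamp <- sub(" [^ ]+$", "", dateTimeZone)

# Hypothetical helper: parse one group with a single tz and return seconds
# since the epoch (plain numerics, so no tzone attribute gets mixed up).
parse_group <- function(x, zone) as.numeric(as.POSIXct(x, tz = zone))

secs <- numeric(length(stamp))
for (z in unique(tz)) {
  idx <- tz == z
  secs[idx] <- parse_group(stamp[idx], z)
}

# Rebuild a single POSIXct vector that displays in UTC.
datetime_utc <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
format(datetime_utc, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Working with epoch seconds inside the loop sidesteps the question of which timezone the combined vector should display in; that choice is made once, at the end.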
Another way using lubridate...
library(stringr)
library(lubridate)
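normalize.timezone <- function(dates, target_tz = Sys.timezone()) {
  # The timezone is the last space-separated token (also works for date-only input)
  tzones <- str_extract(dates, '[^ ]+$')
  # Drop the trailing timezone token, leaving only the date-time part
  dts <- str_replace_all(dates, ' [\\w\\-\\/\\+]+$', '')
  # Parse each element in its own timezone, then shift it to the target timezone
  tmp <- lapply(seq_along(dates), function(i) {
    with_tz(as.POSIXct(dts[i], tz = tzones[i]), target_tz)
  })
  # unlist() drops the POSIXct class and tzone, so restore them from the first element
  final <- unlist(tmp)
  attributes(final) <- attributes(tmp[[1]])
  final
}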
dates <- c('2019-01-06 23:00:00 MST',
'2019-01-22 14:00:00 America/Los_Angeles',
'2019-01-05 UTC-4',
'2019-01-15 15:00:00 Europe/Moscow')
(normalize.timezone(dates, 'EST'))
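The inputs above mix several timezone spellings, and on many platforms R will quietly fall back to UTC (or interpret the string as a POSIX-style offset) for a name it does not recognise. As a rough sketch, assuming the timezone is always the last space-separated token, you can check each token against the system's Olson database before parsing:

```r
dates <- c('2019-01-06 23:00:00 MST',
           '2019-01-22 14:00:00 America/Los_Angeles',
           '2019-01-05 UTC-4',
           '2019-01-15 15:00:00 Europe/Moscow')

# Assumption: the timezone is the last space-separated token of each string.
tzones <- sub("^.* ", "", dates)

# Flag tokens that are not names in the system's Olson database.
known <- tzones %in% OlsonNames()
data.frame(tzone = tzones, known = known)
```

Tokens flagged as unknown (such as an offset-style string) are worth inspecting by hand, since how they are interpreted depends on the platform.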
You can get there by checking each row and processing accordingly, and then putting everything back into a consistent UTC time. This version also matches timezone abbreviations to the full timezone specification.
dates <- c(
"2015-09-24 06:00:00 UTC",
"2015-09-24 05:00:00 PDT"
)
#extract timezone from dates
datestz <- vapply(strsplit(dates," "), tail, 1, FUN.VALUE="")
## Make a master list of abbreviation to
## full timezone names. Used an arbitrary summer
## and winter date to try to catch daylight savings timezones.
tzabbrev <- vapply(
  OlsonNames(),
  function(x) c(
    format(as.POSIXct("2000-01-01",tz=x),"%Z"),
    format(as.POSIXct("2000-07-01",tz=x),"%Z")
  ),
  FUN.VALUE=character(2)
)
tmp <- data.frame(Olson=OlsonNames(), t(tzabbrev), stringsAsFactors=FALSE)
final <- unique(data.frame(tmp[1], abbrev=unlist(tmp[-1])))
## Do the matching:
out <- Map(as.POSIXct, dates, tz=final$Olson[match(datestz,final$abbrev)])
as.POSIXct(unlist(out), origin="1970-01-01", tz="UTC")
# 2015-09-24 06:00:00 UTC 2015-09-24 05:00:00 PDT
#"2015-09-24 06:00:00 GMT" "2015-09-24 12:00:00 GMT"
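One caveat with abbreviation matching: abbreviations are not unique, and match() returns only the first hit, so an abbreviation shared by several zones resolves to whichever zone comes first in the table. A small sketch for inspecting the candidates, where `zones_for_abbrev` is a hypothetical helper and the reference date is an arbitrary choice:

```r
# Sketch: list the Olson zones that report a given abbreviation on a given date.
# zones_for_abbrev() is a hypothetical helper, not part of base R.
zones_for_abbrev <- function(abbrev, on = "2000-01-01") {
  zones <- OlsonNames()
  # Abbreviation each zone reports on the reference date
  seen <- vapply(zones, function(z) format(as.POSIXct(on, tz = z), "%Z"),
                 FUN.VALUE = character(1))
  unname(zones[seen == abbrev])
}

cst <- zones_for_abbrev("CST")
length(cst)   # number of zones sharing the abbreviation
head(cst)
```

If an abbreviation maps to zones with different UTC offsets, the data alone cannot tell you which one was meant, so the choice needs to come from outside knowledge about where the timestamps were recorded.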