R: how to convert timestamps into multiple time zones in the same column

Submitted by 瘦欲@ on 2019-12-25 08:15:40

Question


I have a dataframe that contains two character variables: one is a timestamp and the other is a US state. I have been unsuccessfully trying to convert each timestamp to a POSIX object, with time zone assigned according to the corresponding value for state: Eastern Time (EST) for Florida (FL) and Central Time (CST6CDT) for Texas (TX).
However, no matter what I try, R returns either all of the timestamps in a single time zone or a string containing the number of seconds since the origin. I can of course convert that string to a POSIX object, but then I come full circle and cannot declare multiple time zones. I also tried a loop instead of indexing, but that doesn't work either.

I would be particularly keen to understand what's going on. My guess (perhaps incorrect) is that the problem is to do with the data type declared in a column, as this could explain what happens in Example 3. But, no matter what I've read or attempted, I cannot find out how to get this to work.

Example 1 - Converting df$time to a POSIX object and then trying to assign different timezones by df$state

 df <- data.frame(time = c("2010-03-05 07:03:00", "2010-03-05 16:00:00", "2010-03-06 07:01:00"), state = c("FL", "FL", "TX"))
 df$time <- as.character(df$time); df$state <- as.character(df$state)
 df$time <- as.POSIXct(strptime(df$time, "%Y-%m-%d %H:%M:%S"))
 df$time
#-----
#[1] "2010-03-05 07:03:00 PST" "2010-03-05 16:00:00 PST" "2010-03-06 07:01:00 PST"

df$time has successfully been converted to a POSIX object. But when I try to assign a time zone by state, the time zone stays whatever it was when the column was initialized (in my location, that's PST).

 df$time[df$state == "FL"] <- as.POSIXct (strptime(df$time[df$state == "FL"], "%Y-%m-%d %H:%M:%S"), tz = "EST")
 df$time[df$state == "TX"] <- as.POSIXct (strptime(df$time[df$state == "TX"], "%Y-%m-%d %H:%M:%S"), tz = "CST6CDT")
 df$time
#[1] "2010-03-05 04:03:00 PST" "2010-03-05 13:00:00 PST" "2010-03-06 05:01:00 PST"
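
What happens in Example 1 can be reproduced in isolation. The following is a minimal sketch, not taken from the original post: it assumes a vector created in UTC (instead of the poster's local PST) so the result does not depend on the machine's locale, and the names x and y are made up for illustration. A POSIXct vector stores seconds since 1970-01-01 UTC plus a single tzone attribute, and subassignment keeps the attribute of the vector being assigned into.

```r
# A POSIXct vector: numeric seconds plus ONE "tzone" attribute for all elements
x <- as.POSIXct(c("2010-03-05 07:03:00", "2010-03-06 07:01:00"), tz = "UTC")
attr(x, "tzone")                      # "UTC"

# A single value parsed as 07:03 in the fixed-offset zone "EST" (UTC-5)
y <- as.POSIXct("2010-03-05 07:03:00", tz = "EST")

# Subassignment copies the instant (the number of seconds) but not y's zone
x[1] <- y
attr(x, "tzone")                      # still "UTC"
as.numeric(x[1]) == as.numeric(y)     # TRUE: same instant

# The element is displayed in the vector's zone, not in EST
format(x[1], "%Y-%m-%d %H:%M:%S", usetz = TRUE)   # "2010-03-05 12:03:00 UTC"
```

This matches the shifted clock times in the output above: the instants are correct, but every element is printed in the one zone the vector carries.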

Example 2 - Trying to convert the df$time string directly to each state's time zone without first converting the string to a POSIX object

 df <- data.frame(time = c("2010-03-05 07:03:00", "2010-03-05 16:00:00", "2010-03-06 07:01:00"), state = c("FL", "FL", "TX"))
 df$time <- as.character(df$time); df$state <- as.character(df$state)
 df$time
#[1] "2010-03-05 07:03:00" "2010-03-05 16:00:00" "2010-03-06 07:01:00"
 df$time[df$state == "FL"] <- as.POSIXct (strptime(df$time[df$state == "FL"], "%Y-%m-%d %H:%M:%S"), tz = "EST")
 df$time[df$state == "TX"] <- as.POSIXct (strptime(df$time[df$state == "TX"], "%Y-%m-%d %H:%M:%S"), tz = "CST6CDT")
 df$time
#[1] "1267790580" "1267822800" "1267880460"

Example 3 - Although I can take the df$time strings produced by the code in Example 2 and successfully convert them to EST...

 as.POSIXct(as.numeric(df$time[df$state == "FL"]), origin = "1970-01-01", tz = "EST")

#[1] "2010-03-05 07:03:00 EST" "2010-03-05 16:00:00 EST"

... but if I try to assign those objects back into the data frame, R converts them back to strings and I come full circle.

 df$time[df$state == "FL"] <- as.POSIXct(as.numeric(df$time[df$state == "FL"]), origin = "1970-01-01", tz = "EST")
 df$time
#[1] "1267790580" "1267822800" "1267880460"
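
The strings of digits in Examples 2 and 3 appear to come from ordinary type coercion rather than from anything specific to time zones. Here is a minimal sketch under that assumption, using a plain character vector s and a value p (both names are illustrative, not from the original post): when a POSIXct value is assigned into part of a character vector, the vector stays character and the value's underlying number of seconds is converted to text.

```r
# A plain character vector, like df$time in Example 2
s <- c("2010-03-05 07:03:00", "2010-03-05 16:00:00")

# A proper POSIXct value: 07:03 in the fixed-offset zone "EST"
p <- as.POSIXct(s[1], tz = "EST")

# Assigning into ONE element cannot change the type of the whole vector,
# so the POSIXct value is coerced to character via its numeric seconds
s[1] <- p
class(s)   # "character"
s[1]       # "1267790580"
```

Replacing the whole column at once (df$time <- ...) does change its type, which is why the first conversion in Example 1 works while the partial assignments do not.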

Answer 1:


Based on the comments, R cannot handle multiple time zones in a single vector: a POSIXct vector carries one tzone attribute that applies to every element. So if anybody else is trying to solve the same problem as I was, I can offer a crude but effective workaround.

First, create separate vectors for each time zone and set the POSIX objects to local time in each, then merge the separate vectors into a new vector, with the POSIX objects set to UTC/GMT (or other single time zone of your choosing).

df <- data.frame(time = c("2010-03-05 07:03:00", "2010-03-05 16:00:00", "2010-03-05 08:27:00"), state = c("FL", "FL", "TX"))
df$time <- as.character(df$time); df$state <- as.character(df$state)

# Parse the FL rows as EST; the partial assignment stores plain numbers, so restore the class afterwards
df$timeFL[df$state == "FL"] <- as.POSIXct(strptime(df$time[df$state == "FL"], "%Y-%m-%d %H:%M:%S"), tz = "EST")
df$timeFL <- as.POSIXct(df$timeFL, origin = "1970-01-01", tz = "EST")

# Same for the TX rows, parsed as CST6CDT
df$timeTX[df$state == "TX"] <- as.POSIXct(strptime(df$time[df$state == "TX"], "%Y-%m-%d %H:%M:%S"), tz = "CST6CDT")
df$timeTX <- as.POSIXct(df$timeTX, origin = "1970-01-01", tz = "CST6CDT")

# Merge both helper columns into one column and express it in UTC
df$common.time.UTC[!is.na(df$timeFL)] <- df$timeFL[!is.na(df$timeFL)]
df$common.time.UTC[!is.na(df$timeTX)] <- df$timeTX[!is.na(df$timeTX)]
df$common.time.UTC <- as.POSIXct(df$common.time.UTC, origin = "1970-01-01", tz = "UTC")

# Drop the helper columns
df$timeFL <- NULL; df$timeTX <- NULL
df

               time state     common.time.UTC
2010-03-05 07:03:00    FL 2010-03-05 12:03:00
2010-03-05 16:00:00    FL 2010-03-05 21:00:00
2010-03-05 08:27:00    TX 2010-03-05 14:27:00
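
The same workaround can be written more compactly. This is only a sketch of one possible variant, not the answerer's code: the lookup vector tz_by_state and the columns tz and local.label are hypothetical names introduced here. It parses each row in its own zone, keeps a single UTC column for computation, and formats a per-row local label as character, since a character column is free to show a different zone on every row. Note that "EST" is a fixed UTC-5 zone without daylight saving; IANA names such as "America/New_York" and "America/Chicago" could be swapped in if daylight saving should be followed.

```r
df <- data.frame(
  time  = c("2010-03-05 07:03:00", "2010-03-05 16:00:00", "2010-03-05 08:27:00"),
  state = c("FL", "FL", "TX"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: state -> time zone name (zones as in the question)
tz_by_state <- c(FL = "EST", TX = "CST6CDT")
df$tz <- unname(tz_by_state[df$state])

# Parse each row in its own zone and keep only the seconds since the epoch
secs <- vapply(
  seq_len(nrow(df)),
  function(i) as.numeric(as.POSIXct(df$time[i], tz = df$tz[i])),
  numeric(1)
)

# One POSIXct column, displayed in a single common zone
df$common.time.UTC <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Per-row local display, stored as character
df$local.label <- vapply(
  seq_len(nrow(df)),
  function(i) format(df$common.time.UTC[i], "%Y-%m-%d %H:%M:%S",
                     tz = df$tz[i], usetz = TRUE),
  character(1)
)
df
```

Keeping the UTC column as the only real date-time column means differences and sorting still work, while local.label is for display only.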


Source: https://stackoverflow.com/questions/40514448/r-how-to-convert-timestamps-into-multiple-timezones-in-the-same-column
