Redshift - Adding timezone offset (Varchar) to timestamp column

Posted by 不羁岁月 on 2021-02-11 13:27:13

Question


As part of an ETL to Redshift, one of the source tables has 2 columns:

original_timestamp - TIMESTAMP: the local time when the record was inserted, in whichever region
original_timezone_offset - VARCHAR: the offset from UTC

The data looks something like this:

original_timestamp      original_timezone_offset
2011-06-22 11:00:00.000000    -0700
2014-11-29 17:00:00.000000    -0800
2014-12-02 22:00:00.000000    +0900
2011-06-03 09:23:00.000000    -0700
2011-07-28 03:00:00.000000    -0700
2011-05-01 01:30:00.000000    -0700

In my target table, I need to convert this to UTC (using the offset). How do I do it? So far I have tried multiple things but dateadd() seems to be the closest solution. But the problem with dateadd() is, when I say:

SELECT original_timestamp, original_timezone_offset
 ,dateadd(H, original_timezone_offset, original_timestamp) as original_utc_time

It adds/subtracts 700/800 hours instead of 7/8 hours to the original timestamp, because the offset is a VARCHAR with values like -0700.

Did anyone see this issue before? Appreciate any help/inputs. Thanks.


Answer 1:


Just take the 'hours' part of the offset:

WITH t as (
SELECT  '2011-06-22 11:00:00.000000'::timestamp as original_timestamp, '-0700' as original_timezone_offset
UNION ALL
SELECT '2014-11-29 17:00:00.000000'::timestamp,'-0800'
UNION ALL
SELECT '2014-12-02 22:00:00.000000'::timestamp,'+0900'
)
SELECT
  original_timestamp,
  original_timezone_offset,
  -- Negate the hour offset: local time minus the offset gives UTC
  DATEADD(hour, -1 * SUBSTRING(original_timezone_offset, 1, 3)::INT, original_timestamp) AS original_utc_time
FROM t

2011-06-22 11:00:00 -0700   2011-06-22 18:00:00
2014-11-29 17:00:00 -0800   2014-11-30 01:00:00
2014-12-02 22:00:00 +0900   2014-12-02 13:00:00

You'll need some additional fancy code if you have non-full-hour offsets (eg +0730).
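
One possible shape for that extra code, as a sketch only: convert the whole offset to signed minutes and subtract it from the local time to get UTC. The sample rows, the '+0730' value and the original_utc_time alias are illustrative assumptions, and the offset is assumed to always be a sign followed by four digits.

```sql
WITH t AS (
  SELECT '2014-12-02 22:00:00'::timestamp AS original_timestamp, '+0730' AS original_timezone_offset
  UNION ALL
  SELECT '2011-06-22 11:00:00'::timestamp, '-0700'
)
SELECT
  original_timestamp,
  original_timezone_offset,
  DATEADD(
    minute,
    -- UTC = local time - offset, so negate the signed offset in minutes
    -1
      * (CASE WHEN SUBSTRING(original_timezone_offset, 1, 1) = '-' THEN -1 ELSE 1 END)
      * (SUBSTRING(original_timezone_offset, 2, 2)::INT * 60    -- hours part
         + SUBSTRING(original_timezone_offset, 4, 2)::INT),     -- minutes part
    original_timestamp
  ) AS original_utc_time
FROM t;

-- 2014-12-02 22:00:00  +0730  2014-12-02 14:30:00
-- 2011-06-22 11:00:00  -0700  2011-06-22 18:00:00
```

The sign is taken from the first character rather than from the cast of the hours part, so that an offset like -0030 still comes out negative.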




Answer 2:


First, recognize that if your timestamps are already in local time of the given offset, then you need to subtract that offset to convert back to UTC. In that first example you gave, 2011-06-22 11:00:00 -0700 is equivalent to 2011-06-22 18:00:00 UTC.

However, rather than try to add or subtract these values yourself, you should let the AT TIME ZONE function do the work for you. It will create a timestamptz that is in your supplied offset, then you can use it again to convert to UTC.

(Note that you could use the CONVERT_TIMEZONE function instead, but that one is only understood by Redshift, whereas AT TIME ZONE also works on regular PostgreSQL.)

However, the problem is that the time zone offsets you have aren't in a format understood by these functions. See the time zone usage notes. So, before we try to convert, let's translate your offset strings to an understood format.

We will want -0700 to become +07:00. The colon is required, and the sign must be flipped because it will be interpreted with the POSIX-style time zone format. In that format, positive values lie west of GMT instead of the usual conventions specified in ISO 8601.

translate(substring(original_timezone_offset, 1, 3), '-+', '+-') || ':' || substring(original_timezone_offset, 4, 2)
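
To see what this rewrite does on its own, here is a small sketch that applies it to a few sample strings. The offsets CTE, the posix_offset alias and the '+0530' value are made up for illustration. It uses the || operator rather than a three-argument concat, on the assumption that Redshift's CONCAT accepts only two arguments; on regular PostgreSQL either form should work.

```sql
WITH offsets AS (
  SELECT '-0700' AS original_timezone_offset
  UNION ALL
  SELECT '+0900'
  UNION ALL
  SELECT '+0530'
)
SELECT
  original_timezone_offset,
  -- translate() swaps '-' and '+' in the sign-and-hours part,
  -- then the colon and the minutes part are appended
  translate(substring(original_timezone_offset, 1, 3), '-+', '+-')
    || ':'
    || substring(original_timezone_offset, 4, 2) AS posix_offset
FROM offsets;

-- -0700  +07:00
-- +0900  -09:00
-- +0530  -05:30
```

Because the minutes part is carried through unchanged, non-full-hour offsets need no special handling in this step.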

Then we will use that with AT TIME ZONE to do the conversion:

(original_timestamp AT TIME ZONE <the above mess>) AT TIME ZONE 'UTC' AS utc_timestamp

Putting it all together...

WITH t as (
SELECT  '2011-06-22 11:00:00.000000'::timestamp as original_timestamp, '-0700' as original_timezone_offset
UNION ALL
SELECT '2014-11-29 17:00:00.000000'::timestamp,'-0800'
UNION ALL
SELECT '2014-12-02 22:00:00.000000'::timestamp,'+0900'
)
SELECT
  original_timestamp,
  original_timezone_offset,
  translate(substring(original_timezone_offset, 1, 3), '-+', '+-') || ':' || substring(original_timezone_offset, 4, 2) as modified_timezone_offset,
  (original_timestamp AT TIME ZONE (translate(substring(original_timezone_offset, 1, 3), '-+', '+-') || ':' || substring(original_timezone_offset, 4, 2))) AT TIME ZONE 'UTC' AS utc_timestamp
FROM t

Output:

2011-06-22 11:00:00  -0700  +07:00  2011-06-22 18:00:00
2014-11-29 17:00:00  -0800  +08:00  2014-11-30 01:00:00
2014-12-02 22:00:00  +0900  -09:00  2014-12-02 13:00:00




Source: https://stackoverflow.com/questions/55128396/redshift-adding-timezone-offset-varchar-to-timestamp-column
