Question
I am new to both Java and Apache Spark and am trying to understand timestamp and timezone usage. I would like all timestamps that come from an Apache Spark DataFrame to be stored in the EST timezone in SQL Server.
When I use current_timestamp, I get the correct EST time, but the offset I see in the stored data is '+00:00' instead of '-04:00'.
Here is a value stored in the database that was passed in from a Spark dataset:
2020-04-07 11:36:23.0220 +00:00
From what I can see, current_timestamp does not accept a timezone argument. The time itself is correct (it is in EST), but I don't understand why the offset is wrong.
Any help to understand this would be great.
Answer 1:
Java Timestamps work more or less like LocalDateTime in Java: they don't contain timezone information. The database therefore interprets the value as a UTC timestamp, which is why you get a mismatch. I usually use one of two approaches, depending on which suits better:
- You can return a UTC timestamp from Spark (with a simple custom UDF) instead of using current_timestamp, which is timezone specific.
- You can encode your dates as Strings; similarly, using the java.time API you can achieve that with a simple UDF.
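As a rough sketch of the formatting logic such a UDF could wrap, here is a plain java.time example with no Spark dependency. The class name TimestampOffsets, the method names, and the output pattern are made up for illustration, and the zone America/New_York is an assumption about what "EST" means here (it gives -04:00 during daylight saving time and -05:00 otherwise). Registering this as a Spark UDF is left out.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Spark or any library.
final class TimestampOffsets {
    // Lowercase "xxx" always prints a numeric offset such as -04:00 or +00:00.
    private static final DateTimeFormatter WITH_OFFSET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS xxx");

    // Approach 2: encode the instant as a String that carries its own offset.
    static String formatWithOffset(Instant instant, ZoneId zone) {
        return instant.atZone(zone).format(WITH_OFFSET);
    }

    // Approach 1: express the same instant in UTC, so a +00:00 offset is accurate.
    static String formatUtc(Instant instant) {
        return formatWithOffset(instant, ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(formatUtc(now));
        System.out.println(formatWithOffset(now, ZoneId.of("America/New_York")));
    }
}
```

Both strings describe the same instant; only the wall-clock time and the offset differ, so the stored offset stays consistent with the stored time.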
Hope things are a bit clearer now.
Source: https://stackoverflow.com/questions/61084442/getting-correct-offset-for-timezone-using-current-timestamp-in-apache-spark