Apache Spark - how to set timezone to UTC? currently defaulted to Zulu

花落未央 2020-12-20 04:06

In Spark's web UI (port 8080), the Environment tab shows the following setting:

user.timezone Zulu

Do you know how/where I can override this to U

4 answers
  •  不知归路
    2020-12-20 04:34

    In some cases you will also want to set the JVM timezone. For example, when loading string data into a TimestampType column, Spark interprets the string in the local JVM timezone. To set the JVM timezone, add extra JVM options for both the driver and the executors:

    from pyspark.sql import SparkSession

    # Pin the driver JVM, the executor JVMs, and the SQL session to the same zone.
    spark = SparkSession.builder \
        .appName('test') \
        .master('local') \
        .config('spark.driver.extraJavaOptions', '-Duser.timezone=GMT') \
        .config('spark.executor.extraJavaOptions', '-Duser.timezone=GMT') \
        .config('spark.sql.session.timeZone', 'UTC') \
        .getOrCreate()
    

    We do this in our local unit test environment, since our local time is not GMT.
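
If the same three settings are needed in several test suites, one option is to keep them in a small helper. The following is only a sketch: `timezone_conf`, `build_session`, and `report_timezones` are hypothetical names invented for this illustration, not part of PySpark. It assumes no JVM has been started yet, since `-Duser.timezone` is read when the JVM launches, and it assumes you have no other `extraJavaOptions` set, because assigning the key replaces any previous value.

```python
def timezone_conf(tz="UTC"):
    """Return the three settings from the answer above, all pointing at `tz`."""
    jvm_flag = f"-Duser.timezone={tz}"
    return {
        "spark.driver.extraJavaOptions": jvm_flag,
        "spark.executor.extraJavaOptions": jvm_flag,
        "spark.sql.session.timeZone": tz,
    }


def build_session(app_name="test", tz="UTC"):
    """Create (or reuse) a local SparkSession with the timezone settings applied."""
    # Imported lazily so timezone_conf() works even where pyspark is not installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master("local")
    for key, value in timezone_conf(tz).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def report_timezones(spark):
    """Return (SQL session timezone, driver JVM default timezone ID)."""
    sql_tz = spark.conf.get("spark.sql.session.timeZone")
    # _jvm is an internal py4j handle, so treat this line as a debugging aid only.
    jvm_tz = spark.sparkContext._jvm.java.util.TimeZone.getDefault().getID()
    return sql_tz, jvm_tz
```

Calling `report_timezones(build_session())` in a test setup is one way to check that both values are what you expect before any timestamp data is loaded.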

    Useful reference: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
