Spark SQL converting string to timestamp

妖精的绣舞 submitted on 2019-11-29 11:36:09

Question


I'm new to Spark SQL and am trying to convert a string to a timestamp in a Spark DataFrame. I have a string that looks like '2017-08-01T02:26:59.000Z' in a column called time_string.

My code to convert this string to a timestamp is:

CAST (time_string AS Timestamp)

But this gives me a timestamp of 2017-07-31 19:26:59

Why is it changing the time? Is there a way to do this without changing the time?

Thanks for any help!


Answer 1:


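You could use the unix_timestamp function to convert the UTC-formatted string to a timestamp. Because the pattern quotes the trailing 'Z', it is matched as a literal character rather than as a UTC marker, so the digits are kept as written and read in the session time zone. Note that unix_timestamp returns whole seconds, so the milliseconds are dropped.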

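import org.apache.spark.sql.functions.unix_timestamp
import org.apache.spark.sql.types.TimestampType

val df2 = Seq(("a3fac", "2017-08-01T02:26:59.000Z")).toDF("userid", "eventTime")

df2.withColumn("eventTime", unix_timestamp($"eventTime", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").cast(TimestampType)).show(false)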

Output:

+-------------+---------------------+
|userid       |eventTime            |
+-------------+---------------------+
|a3fac        |2017-08-01 02:26:59.0|
+-------------+---------------------+
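
As for why the original CAST changes the time: the trailing Z marks the string as UTC, CAST honors it, and Spark then displays the resulting instant in the session time zone. The result in the question (19:26:59 on the previous day) is consistent with a session time zone seven hours behind UTC. Here is a minimal sketch of that arithmetic in plain Java, outside Spark. The class name, the helper render, and the zone America/Los_Angeles are assumptions made for illustration; the question does not say which zone the cluster uses.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class TimestampShiftDemo {
    static final DateTimeFormatter OUT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Renders the instant denoted by an ISO-8601 UTC string as wall-clock time in the given zone.
    static String render(String isoInstant, ZoneId zone) {
        return OUT.format(LocalDateTime.ofInstant(Instant.parse(isoInstant), zone));
    }

    public static void main(String[] args) {
        String timeString = "2017-08-01T02:26:59.000Z";
        // Assumed zone: UTC-7 in August, which reproduces the value reported in the question.
        System.out.println(render(timeString, ZoneId.of("America/Los_Angeles"))); // 2017-07-31 19:26:59
        // Rendering the same instant in UTC keeps the digits from the string.
        System.out.println(render(timeString, ZoneOffset.UTC)); // 2017-08-01 02:26:59
    }
}
```

The instant is the same in both lines; only the zone used to render it differs.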

Hope this helps!




Answer 2:


Solution in Java

Spark SQL has functions that let you work with the date format.

Conversion example : 20181224091530 -> 2018-12-24 09:15:30

Solution (Spark SQL statement) :

SELECT
 ...
 to_timestamp(cast(DECIMAL_DATE as string),'yyyyMMddHHmmss') as `TIME STAMP DATE`,
 ...
FROM some_table

You can run SQL statements through an instance of org.apache.spark.sql.SparkSession. For example, to execute an SQL statement, Spark provides the following:

...
// You have to create an instance of SparkSession
sparkSession.sql(sqlStatement); 
...

Notes:

  • You have to convert the decimal to a string first; only then can it be parsed into a timestamp
  • You can adjust the format pattern to get whatever format you want
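
The two notes above can be tried out in plain Java with java.time, which uses the same pattern letters for these fields. This is only a sketch of the pattern and of the decimal-to-string step, not Spark's own implementation; the class name and the helper parseDecimalDate are hypothetical.

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class DecimalDateDemo {
    static final DateTimeFormatter IN = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");
    static final DateTimeFormatter OUT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Converts the decimal to a string first, then parses that string with the compact pattern.
    static String parseDecimalDate(BigDecimal decimalDate) {
        String digits = decimalDate.toPlainString();
        return OUT.format(LocalDateTime.parse(digits, IN));
    }

    public static void main(String[] args) {
        System.out.println(parseDecimalDate(new BigDecimal("20181224091530"))); // 2018-12-24 09:15:30
    }
}
```

Changing the OUT pattern changes how the parsed value is rendered, which is the "adjust the format" point in the second note.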


Source: https://stackoverflow.com/questions/45558499/spark-sql-converting-string-to-timestamp
