Convert string with form “MM/dd/yyyy HH:mm” to joda datetime in dataframe in Spark

Submitted by 让人想犯罪 __ on 2019-12-05 19:37:47
zero323

I don't know what exactly the problem is...

Well, the source of the problem is pretty much described by the error message. Spark SQL doesn't support Joda-Time DateTime as an input. A valid input for a date field is java.sql.Date (see Spark SQL and DataFrame Guide, Data Types for reference).

The simplest solution is to adjust the StockData class so it takes java.sql.Date as an argument and replace:

val date: DateTime = formatter.parseDateTime(p(0))

with something like this:

val date: java.sql.Date = new java.sql.Date(
  formatter.parseDateTime(p(0)).getMillis)

or

val date: java.sql.Timestamp = new java.sql.Timestamp(
  formatter.parseDateTime(p(0)).getMillis)

if you want to preserve the hours and minutes.
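
As a rough sketch of how the adjusted class and the parsing step might fit together: the fields of StockData are not shown in the question, so the `price` field, the comma-separated line layout, and the `ParseStockData` helper below are assumptions made purely for illustration.

```scala
import java.sql.Timestamp
import org.joda.time.format.DateTimeFormat

// Hypothetical record shape; the real StockData fields are not shown in the question.
// The date field uses java.sql.Timestamp, which Spark SQL accepts.
case class StockData(date: Timestamp, price: Double)

object ParseStockData {
  // Joda-Time formatters are immutable and thread-safe, so a single instance can be reused.
  private val formatter = DateTimeFormat.forPattern("MM/dd/yyyy HH:mm")

  // Assumes a line such as "01/02/2015 09:30,12.5" (date first, then a numeric field).
  def parseLine(line: String): StockData = {
    val p = line.split(",")
    // Joda-Time is used only for parsing; the value handed to Spark is a java.sql.Timestamp.
    val ts = new Timestamp(formatter.parseDateTime(p(0)).getMillis)
    StockData(ts, p(1).toDouble)
  }
}
```

With a class like this, something along the lines of rdd.map(ParseStockData.parseLine).toDF() should no longer hit the unsupported-type error, since no Joda-Time object ends up in the schema.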

If you plan to use window functions with a range clause, a better option is to pass the date to the DataFrame as a string and convert it to an integer timestamp (seconds since the epoch):

import org.apache.spark.sql.functions.unix_timestamp

df.withColumn("ts", unix_timestamp($"date", "MM/dd/yyyy HH:mm"))

See Spark Window Functions - rangeBetween dates for details.
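
To illustrate why the integer timestamp helps, here is a minimal sketch of a range-based window over the ts column. It assumes Spark 2.x or later (the `SparkSession` entry point), and the `price` column, the sample rows, the one-hour range, and the `RangeWindowSketch` object are hypothetical and not taken from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, col, unix_timestamp}

object RangeWindowSketch {
  // Adds "ts" (seconds since the epoch) and a trailing one-hour average of "price".
  // Expects a string column "date" in MM/dd/yyyy HH:mm form and a numeric column "price".
  def addHourlyAverage(df: DataFrame): DataFrame = {
    val withTs = df.withColumn("ts", unix_timestamp(col("date"), "MM/dd/yyyy HH:mm"))

    // rangeBetween works on the values of the ordering column, so -3600L means
    // "up to 3600 seconds before the current row's ts".
    // Without partitionBy, Spark moves all rows to a single partition; fine for a sketch only.
    val lastHour = Window.orderBy("ts").rangeBetween(-3600L, 0L)

    withTs.withColumn("avg_last_hour", avg(col("price")).over(lastHour))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("range-window-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val df = Seq(
      ("01/02/2015 09:00", 10.0),
      ("01/02/2015 09:30", 12.0),
      ("01/02/2015 11:00", 20.0)
    ).toDF("date", "price")

    addHourlyAverage(df).show()
    spark.stop()
  }
}
```

In real data you would normally add a partitionBy (for example on a ticker symbol, if one exists) so the window is computed per group rather than over the whole dataset.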
