Convert string with form “MM/dd/yyyy HH:mm” to joda datetime in dataframe in Spark

Submitted by 我只是一个虾纸丫 on 2019-12-10 09:37:23

Question


I'm reading in CSV files in which one column holds a string that should be converted to a datetime. The string is in the form MM/dd/yyyy HH:mm. However, when I try to convert it using Joda-Time, I always get the error:

Exception in thread "main" java.lang.UnsupportedOperationException: Schema for type org.joda.time.DateTime is not supported

I don't know what exactly the problem is...

val input = c.textFile("C:\\Users\\AAPL.csv").map(_.split(",")).map { p =>
  val formatter: DateTimeFormatter = DateTimeFormat.forPattern("MM/dd/yyyy HH:mm")
  val date: DateTime = formatter.parseDateTime(p(0))
  StockData(date, p(1).toDouble, p(2).toDouble, p(3).toDouble, p(4).toDouble, p(5).toInt, p(6).toInt)
}.toDF()

Anybody who can help?


Answer 1:


I don't know what exactly the problem is...

Well, the source of the problem is pretty much described by the error message. Spark SQL doesn't support Joda-Time DateTime as an input type. A valid input type for a date field is java.sql.Date (see the Data Types section of the Spark SQL and DataFrame Guide for reference).

The simplest solution is to adjust the StockData class so that it takes java.sql.Date as an argument, and to replace:

val date: DateTime = formatter.parseDateTime(p(0))

with something like this:

val date: java.sql.Date = new java.sql.Date(
  formatter.parseDateTime(p(0)).getMillis)

or

val date: java.sql.Timestamp = new java.sql.Timestamp(
  formatter.parseDateTime(p(0)).getMillis)

if you want to preserve the hours and minutes.
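As a minimal sketch of the same conversion without the Joda-Time dependency, the string can be parsed with java.time and turned into a java.sql.Timestamp. Note that this swaps java.time (Java 8+) in for the Joda-Time formatter used above; the object name TimestampParsing and the helper parseTimestamp are hypothetical names chosen for illustration.

```scala
import java.sql.Timestamp
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object TimestampParsing {
  // Same pattern as in the question: month/day/year, 24-hour clock, minutes.
  private val formatter: DateTimeFormatter =
    DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm")

  // Parses the string as a local date-time and wraps it in a java.sql.Timestamp,
  // a type Spark SQL can map to a column. The JVM's default time zone is used
  // to interpret the local date-time.
  def parseTimestamp(s: String): Timestamp =
    Timestamp.valueOf(LocalDateTime.parse(s, formatter))
}
```

In the question's code, val date = TimestampParsing.parseTimestamp(p(0)) could then stand in for the Joda call, provided the first field of StockData is declared as java.sql.Timestamp.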

If you are thinking about using window functions with a range clause, a better option is to pass the string to a DataFrame and convert it to an integer timestamp:

import org.apache.spark.sql.functions.unix_timestamp

df.withColumn("ts", unix_timestamp($"date", "MM/dd/yyyy HH:mm"))

See Spark Window Functions - rangeBetween dates for details.
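To illustrate why an integer timestamp helps here, the following is a rough sketch of a range-based window over the ts column. It assumes Spark 2.x or later (SparkSession) on the classpath; the column names close and avg_close_1h, the sample rows, the one-hour frame, and the object name RangeWindowSketch are all made up for the example and are not part of the original question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, col, unix_timestamp}

object RangeWindowSketch {
  // Expects a string column "date" (MM/dd/yyyy HH:mm) and a numeric column "close".
  // Adds "ts" (seconds since the epoch) and a trailing one-hour average of "close".
  def addLastHourAverage(df: DataFrame): DataFrame = {
    val withTs = df.withColumn("ts", unix_timestamp(col("date"), "MM/dd/yyyy HH:mm"))
    // Range frame: all rows whose ts lies in [current ts - 3600, current ts].
    // Without partitionBy, Spark moves all rows into a single partition,
    // which is acceptable for a small sketch only.
    val lastHour = Window.orderBy(col("ts")).rangeBetween(-3600L, 0L)
    withTs.withColumn("avg_close_1h", avg(col("close")).over(lastHour))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("range-window-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: (date string, closing price).
    val df = Seq(
      ("12/10/2019 09:30", 100.0),
      ("12/10/2019 09:45", 102.0),
      ("12/10/2019 11:00", 110.0)
    ).toDF("date", "close")

    addLastHourAverage(df).show(false)
    spark.stop()
  }
}
```

Because ts is a plain number of seconds, the frame boundaries in rangeBetween can be expressed directly as second offsets.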



Source: https://stackoverflow.com/questions/33688945/convert-string-with-form-mm-dd-yyyy-hhmm-to-joda-datetime-in-dataframe-in-spa
