Better way to convert a string field into timestamp in Spark

独厮守ぢ 2020-11-27 16:29

I have a CSV in which one field is a datetime in a specific format. I cannot import it directly into my DataFrame because the field needs to be a timestamp. So I import it as a string and

7 answers
  •  刺人心 (original poster)
    2020-11-27 17:05

    I have ISO 8601 timestamps in my dataset and needed to convert them to "yyyy-MM-dd" format. This is what I did:

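    import org.joda.time.{DateTime, DateTimeZone}

    // Helpers for building Joda-Time DateTime values in UTC.
    object DateUtils extends Serializable {
      // Epoch seconds -> DateTime; 1000L forces Long arithmetic so the product cannot overflow Int.
      def dtFromUtcSeconds(seconds: Int): DateTime = new DateTime(seconds * 1000L, DateTimeZone.UTC)
      // ISO 8601 string -> DateTime, expressed in UTC.
      def dtFromIso8601(isoString: String): DateTime = new DateTime(isoString, DateTimeZone.UTC)
    }

    // Register a UDF, callable from SQL by name, that formats an ISO 8601 string as "yyyy-MM-dd".
    sqlContext.udf.register("formatTimeStamp", (isoTimestamp: String) => DateUtils.dtFromIso8601(isoTimestamp).toString("yyyy-MM-dd"))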
    

    You can then call the UDF by name in your Spark SQL query.
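
    For illustration, here is a rough sketch of what calling the registered UDF from a query might look like. It assumes Spark 2.x or later (a SparkSession rather than the sqlContext used above) with Joda-Time on the classpath. The view name events, the column name created_at, and the sample values are made up for the example and are not from the original post.

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.joda.time.{DateTime, DateTimeZone}

    // Same helper as in the answer, repeated so the sketch is self-contained.
    object DateUtils extends Serializable {
      def dtFromIso8601(isoString: String): DateTime = new DateTime(isoString, DateTimeZone.UTC)
    }

    object FormatTimeStampSketch {
      def main(args: Array[String]): Unit = {
        // Assumption: Spark 2.x+ running locally; on Spark 1.x use sqlContext instead.
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("formatTimeStamp-sketch")
          .getOrCreate()
        import spark.implicits._

        // Register the UDF under the name used in the answer.
        spark.udf.register("formatTimeStamp",
          (isoTimestamp: String) => DateUtils.dtFromIso8601(isoTimestamp).toString("yyyy-MM-dd"))

        // Hypothetical sample data: one string column holding ISO 8601 timestamps.
        Seq("2020-11-27T16:29:00Z", "2020-11-27T17:05:00Z")
          .toDF("created_at")
          .createOrReplaceTempView("events")

        // Call the UDF by its registered name inside the SQL text.
        spark.sql("SELECT created_at, formatTimeStamp(created_at) AS day FROM events").show(false)

        spark.stop()
      }
    }
    ```

    The UDF returns a string, so the day column here is still a string rather than a date or timestamp type; cast it in the query if a typed column is needed.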
