Change column value in a dataframe spark scala

Submitted by 元气小坏坏 on 2019-12-13 10:33:51

Question


This is what my DataFrame looks like at the moment:

+------------+
|    DATE    |
+------------+
|    19931001|
|    19930404|
|    19930603|
|    19930805|
+------------+

I am trying to reformat this string value to yyyy-mm-dd hh:mm:ss.fff and keep it as a string, not as a date or timestamp type.

How would I do that using the withColumn method?


Answer 1:


Here is a solution using a UDF and withColumn. I have assumed that the date field in your DataFrame is a string.

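  import java.text.SimpleDateFormat
  import org.apache.spark.sql.functions.udf
  import spark.implicits._  // enables toDF and the $"..." column syntax

  // Create the dfList DataFrame
  val dfList = spark.sparkContext
    .parallelize(Seq("19931001", "19930404", "19930603", "19930805")).toDF("DATE")

  // Define the UDF before using it: converts "yyyyMMdd" to "yyyy-MM-dd HH:mm:ss"
  val dateToTimeStamp = udf((date: String) => {
    val stringDate = date.substring(0, 4) + "/" + date.substring(4, 6) + "/" + date.substring(6, 8)
    val format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
    format.format(new SimpleDateFormat("yyyy/MM/dd").parse(stringDate))
  })

  dfList.withColumn("DATE", dateToTimeStamp($"DATE")).show()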
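
As a variation on the UDF body above, the same string conversion can be sketched with java.time, which avoids building a SimpleDateFormat on every call and makes it easy to include the milliseconds the question asks for. The object name ReformatDate and the helper reformatDate are hypothetical names chosen for illustration; they are not part of the original answer.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object ReformatDate {
  // Parser for the compact input, e.g. "19931001"
  private val inputFormat = DateTimeFormatter.ofPattern("yyyyMMdd")
  // Output pattern with milliseconds; SSS is the fraction-of-second field
  private val outputFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS")

  // Hypothetical helper: parse the date, then format it at midnight of that day
  def reformatDate(date: String): String =
    LocalDate.parse(date, inputFormat).atStartOfDay().format(outputFormat)
}
```

Assuming the same dfList as above, this could be wrapped as udf(ReformatDate.reformatDate _) and passed to withColumn in place of dateToTimeStamp.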



Answer 2:


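df.withColumn("date",
      from_unixtime(unix_timestamp($"date", "yyyyMMdd"), "yyyy-MM-dd HH:mm:ss.SSS"))

This should work (df here stands for your DataFrame). Note that mm means minutes and MM means months. Likewise, HH is the 24-hour clock, whereas hh is the 12-hour clock and would show midnight as 12. Milliseconds are written SSS, because f is not a valid pattern letter. Since unix_timestamp has second precision, the fractional part will always be 000. Hope this helps.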
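
The difference between mm and MM is easy to check outside Spark. The following is a minimal sketch using plain SimpleDateFormat; the object name PatternLetters and the helper parseAndShow are hypothetical, introduced only for this illustration.

```scala
import java.text.SimpleDateFormat

object PatternLetters {
  // Parse the text with the given input pattern, then print it in an unambiguous form
  def parseAndShow(text: String, inputPattern: String): String = {
    val parsed = new SimpleDateFormat(inputPattern).parse(text)
    new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(parsed)
  }

  // MM reads "10" as the month, October
  val withMonth: String = parseAndShow("19931001", "yyyyMMdd")
  // mm reads "10" as minutes; the month is left at its default, January
  val withMinutes: String = parseAndShow("19931001", "yyyymmdd")
}
```

Here withMonth is a date in October, while withMinutes lands on 1 January at ten minutes past midnight, which is the silent error a lowercase mm introduces.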




Answer 3:


First, I created this DF:

val df = sc.parallelize(Seq("19931001","19930404","19930603","19930805")).toDF("DATE")

For date handling we are going to use the Joda-Time library (don't forget to add the joda-time JAR to the classpath):

import org.joda.time.format.DateTimeFormat

def func(s: String): String = {
  // MM is the month; lowercase mm would be parsed as minutes
  val dateFormat = DateTimeFormat.forPattern("yyyyMMdd")
  val resultDate = dateFormat.parseDateTime(s)
  // Format explicitly; the default toString() returns ISO-8601 with a zone offset
  resultDate.toString("yyyy-MM-dd HH:mm:ss.SSS")
}

Finally, apply the function to the DataFrame:

// On Spark 2.x and later, import spark.implicits._ so that map has a String encoder
val temp = df.map(l => func(l.getString(0)))
val df2 = temp.toDF("DATE")
df2.show()

This answer still needs some work, as I am new to Spark myself, but I think it gets the job done!



Source: https://stackoverflow.com/questions/44018393/change-column-value-in-a-dataframe-spark-scala
