Add Number of days column to Date Column in same dataframe for Spark Scala App

我在风中等你 2020-12-02 01:55

I have a dataframe df with columns ("id", "current_date", "days"), and I am trying to add the "days" to the "current_date" column to get the resulting date in a new column.

2 Answers
  • 2020-12-02 02:23

    No need to use a UDF; you can do it with a SQL expression:

    import org.apache.spark.sql.functions.expr
    val newDF = df.withColumn("new_date", expr("date_add(current_date, days)"))
    
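    As a quick check, here is a minimal spark-shell sketch using the same toy data as the UDF answer below; the backticks around current_date are only a precaution, to keep the column from clashing with Spark's built-in current_date function:

    scala> import org.apache.spark.sql.functions.expr

    scala> val df = Seq((1, "2017-01-01", 10), (2, "2017-01-01", 20)).toDF("id", "current_date", "days")

    scala> df.withColumn("new_date", expr("date_add(`current_date`, days)")).show()
    +---+------------+----+----------+
    | id|current_date|days|  new_date|
    +---+------------+----+----------+
    |  1|  2017-01-01|  10|2017-01-11|
    |  2|  2017-01-01|  20|2017-01-21|
    +---+------------+----+----------+
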
  • 2020-12-02 02:24

    A small custom UDF can be used to make this date arithmetic possible.

    import org.apache.spark.sql.functions.udf
    import java.util.concurrent.TimeUnit
    import java.util.Date
    import java.text.SimpleDateFormat

    // Parse the "yyyy-MM-dd" string, add `y` days worth of milliseconds, and format back to a string.
    val date_add = udf((x: String, y: Int) => {
      val sdf = new SimpleDateFormat("yyyy-MM-dd")
      val result = new Date(sdf.parse(x).getTime() + TimeUnit.DAYS.toMillis(y))
      sdf.format(result)
    })
    

    Usage:

    scala> val df = Seq((1, "2017-01-01", 10), (2, "2017-01-01", 20)).toDF("id", "current_date", "days")
    df: org.apache.spark.sql.DataFrame = [id: int, current_date: string, days: int]
    
    scala> df.withColumn("new_Date", date_add($"current_date", $"days")).show()
    +---+------------+----+----------+
    | id|current_date|days|  new_Date|
    +---+------------+----+----------+
    |  1|  2017-01-01|  10|2017-01-11|
    |  2|  2017-01-01|  20|2017-01-21|
    +---+------------+----+----------+
    