Convert date to end of month in Spark


Question


I have a Spark DataFrame as shown below:

#Create DataFrame    
df <- data.frame(name = c("Thomas", "William", "Bill", "John"),
      dates = c('2017-01-05', '2017-02-23', '2017-03-16', '2017-04-08'))
df <- createDataFrame(df)

#Make sure df$dates column is in 'date' format    
df <- withColumn(df, 'dates', cast(df$dates, 'date'))

name    | dates
--------------------
Thomas  |2017-01-05
William |2017-02-23
Bill    |2017-03-16
John    |2017-04-08

I want to change each date to the last day of its month, so the result looks like the table below. How do I do this? Either SparkR or PySpark code is fine.

name    | dates
--------------------
Thomas  |2017-01-31
William |2017-02-28
Bill    |2017-03-31
John    |2017-04-30

Answer 1:


You may use the following (PySpark):

from pyspark.sql.functions import last_day

df.select('name', last_day(df.dates).alias('dates')).show()

To clarify, last_day(date) returns the last day of the month to which the given date belongs.
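
For reference, here is a minimal self-contained sketch using the question's data (the session setup is assumed boilerplate; only last_day does the actual work):

from pyspark.sql import SparkSession
from pyspark.sql.functions import last_day, to_date

spark = SparkSession.builder.getOrCreate()

# Recreate the question's DataFrame and cast the string column to a date
df = spark.createDataFrame(
    [("Thomas", "2017-01-05"), ("William", "2017-02-23"),
     ("Bill", "2017-03-16"), ("John", "2017-04-08")],
    ["name", "dates"])
df = df.withColumn("dates", to_date(df.dates))

# last_day maps each date to the final day of its month
df.select("name", last_day(df.dates).alias("dates")).show()

# +-------+----------+
# |   name|     dates|
# +-------+----------+
# | Thomas|2017-01-31|
# |William|2017-02-28|
# |   Bill|2017-03-31|
# |   John|2017-04-30|
# +-------+----------+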

I'm pretty sure there is a similar function in SparkR: https://spark.apache.org/docs/1.6.2/api/R/last_day.html




Answer 2:


For completeness, here is the SparkR code:

df <- withColumn(df, 'dates', last_day(df$dates))
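
This is the same Spark SQL last_day function that the PySpark answer uses, so it produces exactly the desired output shown in the question.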


Source: https://stackoverflow.com/questions/44686700/convert-date-to-end-of-month-in-spark
