PySpark 1.5: How to Truncate a Timestamp from Seconds to the Nearest Minute

予麋鹿 2021-02-07 21:11

I am using PySpark. I have a column ('dt') in a dataframe ('canon_evt') that is a timestamp. I am trying to remove the seconds from a DateTime value. It is originally read in

4 Answers
  •  自闭症患者
    2021-02-07 22:10

    To truncate the timestamp to some other interval, say 5, 7, or 10 minutes, you can group the rows with window():

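    from pyspark.sql import functions as F

    # Sample data; the columns are named "date", "val" and "val2".
    df = spark.createDataFrame([("2016-03-11 09:00:07", 1, 5), ("2016-03-11 09:00:57", 2, 5)]).toDF("date", "val", "val2")
    # Bucket rows into 5-minute windows; change the duration string for 7 or 10 minutes.
    # Summing "val2" is an assumption: the DataFrame has no "val1" column.
    w = df.groupBy("val", F.window(F.col("date").cast("timestamp"), "5 minutes")).agg(F.sum("val2").alias("sum"))
    w.select(w.window.start.cast("string").alias("start"), w.window.end.cast("string").alias("end"), "sum", "val").show(10, False)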
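
    If you only need the truncated value and not a grouped result, the bucketing comes down to flooring the epoch seconds to a multiple of the bucket size. Here is a rough pure-Python sketch of that arithmetic, with no Spark involved. truncate_to_minutes is a made-up helper name, and treating the naive timestamp as UTC is an assumption. The same idea should carry over to a column expression built on unix_timestamp, but that is not shown or tested here.

```python
import calendar
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def truncate_to_minutes(ts, minutes=1):
    """Floor a naive timestamp to the start of its `minutes`-wide bucket.

    Hypothetical helper: buckets are counted from the Unix epoch and the
    naive timestamp is assumed to be in UTC.
    """
    if minutes <= 0:
        raise ValueError("minutes must be a positive integer")
    bucket = minutes * 60
    # Whole seconds since the epoch; timetuple() drops any microseconds.
    epoch_seconds = calendar.timegm(ts.timetuple())
    # Subtract the remainder to land on the start of the bucket.
    floored = epoch_seconds - epoch_seconds % bucket
    return EPOCH + timedelta(seconds=floored)

print(truncate_to_minutes(datetime(2016, 3, 11, 9, 0, 57)))     # 2016-03-11 09:00:00
print(truncate_to_minutes(datetime(2016, 3, 11, 9, 7, 30), 5))  # 2016-03-11 09:05:00
```

    Because the buckets are counted from the epoch, a size that does not divide an hour evenly (7 minutes, for example) will not line up with the top of the hour.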
    
