I am using PySpark. I have a column ('dt') in a dataframe ('canon_evt') that is a timestamp. I am trying to remove seconds from a DateTime value. It is originally read in
To truncate the timestamp to some other interval, say 5, 10, or 7 minutes:
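from pyspark.sql import functions as F  # namespaced import avoids shadowing Python's built-in sum

# Assumes an active SparkSession named `spark`.
df = spark.createDataFrame([("2016-03-11 09:00:07", 1, 5), ("2016-03-11 09:00:57", 2, 5)]).toDF("date", "val", "val2")
# Bucket rows into fixed windows (use e.g. "5 minutes" for minute-level buckets); the frame has no "val1" column, so sum "val2".
w = df.groupBy("val", F.window("date", "5 seconds")).agg(F.sum("val2").alias("sum"))
w.select(w.window.start.cast("string").alias("start"), w.window.end.cast("string").alias("end"), "sum", "val").show(10, False)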
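If you only need the truncated timestamp as a column rather than a grouped window, one possible approach is to round the epoch seconds down to a multiple of the interval. The sketch below is an illustration, not the method from the code above: `floor_epoch_seconds` and `floor_to_minutes` are hypothetical helper names, and the Spark part assumes the column is a timestamp (or a string Spark can parse as one) and that the session time zone is the one you want the result expressed in.

```python
from datetime import datetime, timezone


def floor_epoch_seconds(epoch_seconds: int, minutes: int) -> int:
    """Round an epoch timestamp (in seconds) down to a multiple of `minutes`."""
    step = minutes * 60
    return epoch_seconds - (epoch_seconds % step)


def floor_to_minutes(col_name: str, minutes: int):
    """Build a Spark Column that applies the same rounding to a timestamp column.

    Hypothetical helper. pyspark is imported lazily so the arithmetic above
    can be checked without a Spark installation.
    """
    from pyspark.sql import functions as F

    step = minutes * 60
    floored = F.floor(F.unix_timestamp(F.col(col_name)) / step) * step
    # from_unixtime returns a formatted string, so cast it back to a timestamp
    return F.from_unixtime(floored).cast("timestamp")


# Plain-Python check of the arithmetic: 09:03:57 UTC with a 5-minute step
ts = int(datetime(2016, 3, 11, 9, 3, 57, tzinfo=timezone.utc).timestamp())
floored = datetime.fromtimestamp(floor_epoch_seconds(ts, 5), tz=timezone.utc)
print(floored.isoformat())  # 2016-03-11T09:00:00+00:00
```

With the dataframe from the question, this could be applied as `canon_evt.withColumn("dt_5min", floor_to_minutes("dt", 5))`, where `dt_5min` is just an illustrative column name. For the simpler case of dropping only the seconds, `F.date_trunc("minute", F.col("dt"))` (available since Spark 2.3) should be enough.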