I am using PySpark. I have a column ('dt') in a dataframe ('canon_evt') that is a timestamp. I am trying to remove seconds from a DateTime value. It is originally read in
This question was asked a few years ago, but if anyone else comes across it: as of Spark v2.3, date_trunc is available as a built-in function. Removing the seconds is now as simple as the following (this assumes canon_evt is a dataframe with a timestamp column dt that we want to remove the seconds from):
from pyspark.sql.functions import date_trunc
canon_evt = canon_evt.withColumn('dt', date_trunc('minute', canon_evt.dt))
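To make the effect concrete, here is a plain-Python sketch of what truncating to the minute does to a single value. It is only an analogy using the standard datetime module, not Spark code; the helper name trunc_to_minute and the sample timestamp are made up for illustration. The point is that the seconds and fractional seconds are dropped, not rounded.

```python
from datetime import datetime


def trunc_to_minute(ts: datetime) -> datetime:
    """Hypothetical helper: zero out seconds and microseconds (no rounding)."""
    return ts.replace(second=0, microsecond=0)


# Made-up sample value, standing in for one row of the 'dt' column.
sample = datetime(2017, 3, 14, 9, 26, 53, 589793)
truncated = trunc_to_minute(sample)

print(truncated)  # 2017-03-14 09:26:00
```

Note that 09:26:53 becomes 09:26:00 rather than 09:27:00, which is the behaviour you should also expect from date_trunc('minute', ...).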