I have the PySpark DataFrame below, which can be recreated with the following code:
df = spark.createDataFrame([(1, "John Doe", "2020-11-30"),(2, "John Do