Suppose there is a DataFrame with a column, here called dates1, that holds dates as strings such as '12-21-1991'.
It doesn't work because your data is not a valid ISO 8601 representation, so the cast to DATE returns NULL:
sqlContext.sql("SELECT CAST('12-21-1991' AS DATE)").show()
## +----+
## | _c0|
## +----+
## |null|
## +----+
You'll have to parse the data first using a custom format:
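from pyspark.sql.functions import date_format, unix_timestamp

output_format = ...  # Some SimpleDateFormat string

# Parse the string with its actual pattern, then render it in output_format
df.select(date_format(
    unix_timestamp("dates1", "MM-dd-yyyy").cast("timestamp"),
    output_format
))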
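
If it helps to see the same parse-then-format idea outside Spark, here is a minimal plain-Python sketch. It is only an analogy and does not use the Spark API: the helper name reformat_date and its default formats are made up for illustration, and the strptime pattern "%m-%d-%Y" is assumed to be the counterpart of the "MM-dd-yyyy" pattern above. Returning None on a mismatch loosely mirrors the NULL you get from the failed cast.

```python
from datetime import datetime


def reformat_date(value, in_fmt="%m-%d-%Y", out_fmt="%Y-%m-%d"):
    """Parse value with in_fmt and render it with out_fmt.

    Hypothetical helper for illustration only; returns None when the
    string does not match in_fmt.
    """
    try:
        return datetime.strptime(value, in_fmt).strftime(out_fmt)
    except ValueError:
        # The input does not follow the expected pattern
        return None


print(reformat_date("12-21-1991"))  # 1991-12-21
print(reformat_date("1991-12-21"))  # None (does not match MM-dd-yyyy)
```

The point is the same as in the Spark version: the string is only interpreted correctly once you state the pattern it was written in, and the output pattern is chosen separately.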