Why do I get null results from the PySpark date_format() function?

广开言路 2020-12-04 02:45

Suppose there is a DataFrame with a column of dates stored as strings. To illustrate, we create the following DataFrame as an example:



        
1 answer
  • 2020-12-04 03:25

    It doesn't work because your data is not a valid ISO 8601 representation (yyyy-MM-dd), so the cast to date returns NULL:

    sqlContext.sql("SELECT CAST('12-21-1991' AS DATE)").show()
    ## +----+
    ## | _c0|
    ## +----+
    ## |null|
    ## +----+
    

    You'll have to parse the data first using a custom format that matches the strings:

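    from pyspark.sql.functions import date_format, unix_timestamp

    output_format = ...  # Some SimpleDateFormat string
    # Parse the string using its actual layout, then format the resulting timestamp
    df.select(date_format(
        unix_timestamp("dates1", "MM-dd-yyyy").cast("timestamp"),
        output_format
    ))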
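
The same parse-then-format idea can be sketched in plain Python, without Spark, to show why the input layout matters. This is only an analogy: `reformat_date` is a hypothetical helper, the output layout `%Y/%m/%d` is an arbitrary choice, and Python's `strptime` directives (`%m-%d-%Y`) are not the same syntax as the SimpleDateFormat patterns (`MM-dd-yyyy`) that Spark expects.

```python
from datetime import datetime
from typing import Optional


def reformat_date(value: str, input_format: str, output_format: str) -> Optional[str]:
    """Parse `value` using `input_format`, then render it using `output_format`.

    Returns None when the string does not match the input format, which is
    loosely analogous to Spark producing NULL for an unparseable date.
    """
    try:
        parsed = datetime.strptime(value, input_format)
    except ValueError:
        return None
    return parsed.strftime(output_format)


# Assuming an ISO-style layout (year-month-day) fails for this string.
iso_result = reformat_date("12-21-1991", "%Y-%m-%d", "%Y/%m/%d")

# Supplying the real layout (month-day-year) parses correctly.
custom_result = reformat_date("12-21-1991", "%m-%d-%Y", "%Y/%m/%d")

print(iso_result)     # None
print(custom_result)  # 1991/12/21
```

The first call fails for the same reason the cast above returns NULL: the parser is told to expect a four-digit year first, but the string starts with the month.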
    