How to set the JDBC partitionColumn type to Date in Spark 2.4.1

Backend · unresolved · 4 answers · 1946 views
清歌不尽 · asked 2021-01-05 20:14

I am trying to retrieve data from Oracle using spark-sql-2.4.1. I tried to set the JDBC options as below:

    .option("lowerBound", "31-MAR-02");
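For context, here is a minimal sketch of the kind of partitioned read the question implies. The URL, table, column names, and credentials below are hypothetical, not from the question; the point is that when the partition column is an Oracle DATE, Spark 2.4 expects the bounds to parse as yyyy-MM-dd, so a value like 31-MAR-02 fails bound parsing:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("oracle-partitioned-read").getOrCreate()

    // Hypothetical connection details; substitute your own.
    val jdbcUrl = "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1"

    val df = spark.read.format("jdbc")
      .option("url", jdbcUrl)
      .option("user", "scott")                 // hypothetical credentials
      .option("password", "tiger")
      .option("dbtable", "orders")             // hypothetical table with a DATE column
      .option("partitionColumn", "order_date") // must be numeric, date, or timestamp in 2.4
      .option("lowerBound", "2002-03-31")      // yyyy-MM-dd, not 31-MAR-02
      .option("upperBound", "2002-04-30")
      .option("numPartitions", 4)
      .load()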


        
4 Answers

  •  被撕碎了的回忆 · 2021-01-05 20:20

    If you are using Oracle, see https://github.com/apache/spark/blob/master/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala#L441

    val df1 = spark.read.format("jdbc")
      .option("url", jdbcUrl)
      .option("dbtable", "datetimePartitionTest")
      .option("partitionColumn", "d")
      .option("lowerBound", "2018-07-06")
      .option("upperBound", "2018-07-20")
      .option("numPartitions", 3)
      // oracle.jdbc.mapDateToTimestamp defaults to true. If this flag is not disabled, column d
      // (Oracle DATE) will be resolved as Catalyst Timestamp, which will fail bound evaluation of
      // the partition column. E.g. 2018-07-06 cannot be evaluated as Timestamp, and the error
      // message says: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff].
      .option("oracle.jdbc.mapDateToTimestamp", "false")
      .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
      .load()
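
    A quick way to confirm the bounds were accepted and the scan was actually split (a hedged sketch, reusing df1 from above):

    // Should print 3, matching numPartitions.
    println(df1.rdd.getNumPartitions)

    // The physical plan shows the JDBC relation; each of the three partitions
    // issues its own query with a WHERE predicate derived from the date bounds.
    df1.explain()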
    
