Can't find uuid in org.apache.spark.sql.types.DataTypes

Submitted by 大憨熊 on 2019-12-11 01:29:23

Question


We have a PostgreSQL table that has a UUID column. How do we send a UUID field from a Spark Dataset (using Java) to the PostgreSQL database? We cannot find a uuid type in org.apache.spark.sql.types.DataTypes.

Please advise.


Answer 1:


Yes, you are right: there is no UUID data type in Spark SQL. Treating the values as String should work, because the connector will convert the String to UUID.

I haven't tried with PostgreSQL, but when I used Cassandra (and Scala) it worked perfectly.
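
Because the column travels through Spark as a string, each UUID has to be rendered in its canonical text form before it goes into the Dataset. A minimal Java sketch of that conversion, using only java.util.UUID; the class and method names (UuidColumnValues, toColumnValue, isUuid) are made up for illustration and are not part of Spark or of any connector:

```java
import java.util.UUID;

// Hypothetical helper for illustration; not part of Spark or the PostgreSQL driver.
final class UuidColumnValues {

    // Render a UUID in canonical lowercase text form (8-4-4-4-12 hex digits),
    // which can be stored in a StringType column.
    static String toColumnValue(UUID id) {
        return id.toString();
    }

    // Return true only if the string parses as a UUID and round-trips to the
    // same text (ignoring case), i.e. it is already in canonical form.
    static boolean isUuid(String value) {
        if (value == null) {
            return false;
        }
        try {
            return UUID.fromString(value).toString().equalsIgnoreCase(value);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.println(toColumnValue(id));
        System.out.println(isUuid("not-a-uuid"));
    }
}
```

Validating the strings before the write can surface malformed values on the Spark side instead of as a database error in the middle of a batch.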




Answer 2:


As already pointed out, despite these resolved issues (10186, 5753) there is still no supported uuid Postgres data type as of Spark 2.3.0.

However, there is a workaround: write with Spark's SaveMode.Append and set the PostgreSQL JDBC property stringtype to unspecified, so that the server infers the type of string parameters. In short, it works like this:

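import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions

// JDBC options for the PostgreSQL connection
val props = Map(
  JDBCOptions.JDBC_DRIVER_CLASS -> "org.postgresql.Driver",
  "url" -> url,
  "user" -> user,
  // send strings untyped so PostgreSQL can infer uuid for the uuid column
  "stringtype" -> "unspecified"
)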

yourData.write.mode(SaveMode.Append)
    .format("jdbc")
    .options(props)
    .option("dbtable", tableName)
    .save()
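
Since the question asks about Java, here is a rough Java counterpart of the option map above. It is a sketch under assumptions: the class JdbcUuidOptions and its build method are hypothetical names, the URL and user are placeholders, "driver" is the literal option key that JDBCOptions.JDBC_DRIVER_CLASS stands for, and the Spark call appears only in a comment so that the block compiles without Spark on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; the names are not from Spark.
final class JdbcUuidOptions {

    // Build the JDBC options for a Spark write to PostgreSQL.
    static Map<String, String> build(String url, String user) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("driver", "org.postgresql.Driver");
        opts.put("url", url);
        opts.put("user", user);
        // Send strings untyped so PostgreSQL can infer uuid for the uuid column.
        opts.put("stringtype", "unspecified");
        return opts;
    }

    public static void main(String[] args) {
        // Placeholder connection details, assumed for the example.
        Map<String, String> opts =
                build("jdbc:postgresql://localhost:5432/mydb", "spark_user");
        System.out.println(opts);
        // With Spark on the classpath, the write would look roughly like:
        // yourData.write().mode(SaveMode.Append).format("jdbc")
        //         .options(opts).option("dbtable", tableName).save();
    }
}
```

The Java DataFrameWriter accepts a java.util.Map<String, String> in options(...), so the map can be passed through unchanged.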

The table should be created with the uuid column already defined with type uuid. If you try to have Spark 2.3.0 create this table though, you will again hit a wall:

yourData.write.mode(SaveMode.Overwrite)
    .format("jdbc")
    .options(props)
    .option("dbtable", tableName)
    .option("createTableColumnTypes", "some_uuid_column_name uuid")
    .save()

DataType uuid is not supported.(line 1, pos 21)



Source: https://stackoverflow.com/questions/47368906/cant-find-uuid-in-org-apache-spark-sql-types-datatypes
