How to flatten a Parquet array datatype when using IBM Cloud SQL Query

Asked by 微笑、不失礼 on 2020-04-16 08:37:29

Question


I need to push Parquet file data, which I am reading with IBM Cloud SQL Query, into Db2 on Cloud.

My Parquet file contains columns with array data, and I want to push those to Db2 on Cloud as well.

Is there any way to push the array data from the Parquet file to Db2 on Cloud?


Answer 1:


Have you checked out this advice in the documentation?

https://cloud.ibm.com/docs/services/sql-query?topic=sql-query-overview#limitations

If a JSON, ORC, or Parquet object contains a nested or arrayed structure, a query with CSV output using a wildcard (for example, SELECT * from cos://...) returns an error such as "Invalid CSV data type used: struct." Use one of the following workarounds:

  • For a nested structure, use the FLATTEN table transformation function.
  • Alternatively, you can specify the fully nested column names instead of the wildcard, for example, SELECT address.city, address.street, ... from cos://....
  • For an array, use the Spark SQL explode() function, for example, select explode(contact_names) from cos://....
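
To make that concrete, here is a rough sketch of what the two workarounds can look like in a SQL Query statement. The bucket, object, and column names (mybucket, customers.parquet, customer_id, contact_names) are hypothetical, and the exact FLATTEN invocation should be checked against the SQL Query reference:

  -- Nested structure: FLATTEN expands struct fields into flat columns
  -- so the result can be written out as CSV.
  SELECT *
  FROM FLATTEN(cos://us-geo/mybucket/customers.parquet STORED AS PARQUET)
  INTO cos://us-geo/mytargetbucket/flattened/ STORED AS CSV

  -- Array column: explode() emits one row per array element
  -- (contact_names is assumed to be an array column).
  SELECT customer_id, explode(contact_names) AS contact_name
  FROM cos://us-geo/mybucket/customers.parquet STORED AS PARQUET
  INTO cos://us-geo/mytargetbucket/exploded/ STORED AS CSV

Once the struct and array columns have been flattened or exploded into plain scalar rows, the result no longer contains nested types and can be loaded into Db2 on Cloud.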


Source: https://stackoverflow.com/questions/60881918/how-to-flatten-an-parquet-array-datatype-when-using-ibm-cloud-sql-query
