How to flatten a Parquet array datatype when using IBM Cloud SQL Query

Asked by 微笑、不失礼 on 2020-04-16 08:37:29

Question


I need to push Parquet file data, which I am reading with IBM Cloud SQL Query, into Db2 on Cloud.

My Parquet file contains columns with array data, and I want to push those to Db2 on Cloud as well.

Is there any way to push the array data from the Parquet file to Db2 on Cloud?


Answer 1:


Have you checked out this advice in the documentation?

https://cloud.ibm.com/docs/services/sql-query?topic=sql-query-overview#limitations

If a JSON, ORC, or Parquet object contains a nested or arrayed structure, a query with CSV output using a wildcard (for example, SELECT * from cos://...) returns an error such as "Invalid CSV data type used: struct." Use one of the following workarounds:

  • For a nested structure, use the FLATTEN table transformation function.
  • Alternatively, you can specify the fully nested column names instead of the wildcard, for example, SELECT address.city, address.street, ... from cos://....
  • For an array, use the Spark SQL explode() function, for example, select explode(contact_names) from cos://....
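
To make that concrete, here is a rough sketch of what the two workarounds can look like in a SQL Query statement. The bucket, object, and column names (mybucket, customers.parquet, customer_id, contact_names) are hypothetical, and the exact FLATTEN invocation should be checked against the SQL Query reference:

  -- Nested structure: FLATTEN expands struct fields into flat columns
  -- so the result can be written out as CSV.
  SELECT *
  FROM FLATTEN(cos://us-geo/mybucket/customers.parquet STORED AS PARQUET)
  INTO cos://us-geo/mytargetbucket/flattened/ STORED AS CSV

  -- Array column: explode() emits one row per array element
  -- (contact_names is assumed to be an array column).
  SELECT customer_id, explode(contact_names) AS contact_name
  FROM cos://us-geo/mybucket/customers.parquet STORED AS PARQUET
  INTO cos://us-geo/mytargetbucket/exploded/ STORED AS CSV

Once the struct and array columns have been flattened or exploded into plain scalar rows, the result no longer contains nested types and can be loaded into Db2 on Cloud.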


Source: https://stackoverflow.com/questions/60881918/how-to-flatten-an-parquet-array-datatype-when-using-ibm-cloud-sql-query
