Write Parquet format to HDFS using the Java API without using Avro and MR

Submitted by 与世无争的帅哥 on 2019-12-01 05:59:56
loicmathieu

Indeed, there are not many samples available for reading/writing Apache Parquet files without the help of an external framework.

The core Parquet library is parquet-column, where you can find some tests that read and write records directly: https://github.com/apache/parquet-mr/blob/master/parquet-column/src/test/java/org/apache/parquet/io/TestColumnIO.java

You then just need to use the same functionality with an HDFS file. You can follow this SO question for that: Accessing files in HDFS using Java
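As a rough illustration of the HDFS side, here is a minimal sketch of opening an output stream through the Hadoop FileSystem API. The class name HdfsWriteSketch, the namenode address hdfs://namenode:8020 and the path /tmp/hello.txt are placeholders I made up, not values from the question, and the hadoop-client libraries are assumed to be on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class HdfsWriteSketch {

    // Writes the given bytes to `file` on the file system identified by `fsUri`.
    static void writeBytes(URI fsUri, Path file, byte[] data) throws IOException {
        Configuration conf = new Configuration();
        // FileSystem.get returns a shared, cached instance, so it is not closed here.
        FileSystem fs = FileSystem.get(fsUri, conf);
        // The second argument (true) overwrites the file if it already exists.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder namenode address and path: adjust to your cluster.
        writeBytes(URI.create("hdfs://namenode:8020"),
                   new Path("/tmp/hello.txt"),
                   "hello".getBytes(StandardCharsets.UTF_8));
    }
}
```

The same method works against the local file system if you pass a file:/// URI, which can be handy for trying it out without a cluster.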

UPDATED: regarding the deprecated parts of the API: AvroWriteSupport should be replaced by AvroParquetWriter. I also checked ParquetWriter: it is not deprecated and can be used safely.
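For what it's worth, here is a minimal sketch of using ParquetWriter without Avro and without MapReduce, through the ExampleParquetWriter builder and the Group record model that ship with parquet-hadoop. This is one possible approach, not necessarily what the test file above does. The schema, the class name ParquetWriterSketch, the namenode address and the output path are all made up for illustration, and parquet-hadoop plus hadoop-client are assumed to be on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.SimpleGroupFactory;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.example.ExampleParquetWriter;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

import java.io.IOException;

public class ParquetWriterSketch {

    // Hypothetical two-column schema, for illustration only.
    static final MessageType SCHEMA = MessageTypeParser.parseMessageType(
            "message user {\n"
          + "  required int64 id;\n"
          + "  required binary name (UTF8);\n"
          + "}");

    // Writes two sample records to `file`; the target file system is resolved from `conf`.
    static void writeUsers(Configuration conf, Path file) throws IOException {
        SimpleGroupFactory groups = new SimpleGroupFactory(SCHEMA);
        try (ParquetWriter<Group> writer = ExampleParquetWriter.builder(file)
                .withConf(conf)
                .withType(SCHEMA)
                .build()) {
            writer.write(groups.newGroup().append("id", 1L).append("name", "alice"));
            writer.write(groups.newGroup().append("id", 2L).append("name", "bob"));
        }
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Placeholder namenode address: adjust to your cluster.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        writeUsers(conf, new Path("/tmp/users.parquet"));
    }
}
```

With the default write mode the writer fails if the target file already exists, so pick a fresh path or delete the old file first.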

Regards,

Loïc
