How to append data to an existing parquet file

长发绾君心 2021-01-04 08:53

I'm using the following code to create a ParquetWriter and write records to it.

ParquetWriter parquetWriter = new ParquetWriter(path,          


        
2 Answers
  •  没有蜡笔的小新
    2021-01-04 09:22

    Parquet is a columnar file format: it is optimized for writing all the values of each column together. Any edit to the data, including an append, therefore requires rewriting the file.

    From Wikipedia:

    A column-oriented database serializes all of the values of a column together, then the values of the next column, and so on. For our example table, the data would be stored in this fashion:

    10:001,12:002,11:003,22:004;
    Smith:001,Jones:002,Johnson:003,Jones:004;
    Joe:001,Mary:002,Cathy:003,Bob:004;
    40000:001,50000:002,44000:003,55000:004;
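
To make the consequence for appends concrete, here is a minimal Java sketch of the layout in the quoted example. It is a toy illustration only, not Parquet's actual on-disk format, and it does not use the Parquet library. The class name `ColumnarLayoutSketch`, the method `serializeByColumn`, and the fifth row are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the column-oriented layout quoted above (not real Parquet).
class ColumnarLayoutSketch {

    // Serialize a row-oriented table column by column: all values of
    // column 0, then all values of column 1, and so on. Each value is
    // tagged with a 1-based, zero-padded row id, as in the example.
    static List<String> serializeByColumn(String[][] rows) {
        List<String> columns = new ArrayList<>();
        if (rows.length == 0) {
            return columns;
        }
        int columnCount = rows[0].length;
        for (int c = 0; c < columnCount; c++) {
            StringBuilder sb = new StringBuilder();
            for (int r = 0; r < rows.length; r++) {
                if (r > 0) {
                    sb.append(',');
                }
                sb.append(rows[r][c]).append(':').append(String.format("%03d", r + 1));
            }
            sb.append(';');
            columns.add(sb.toString());
        }
        return columns;
    }

    public static void main(String[] args) {
        // The four rows from the quoted example.
        String[][] rows = {
            {"10", "Smith", "Joe", "40000"},
            {"12", "Jones", "Mary", "50000"},
            {"11", "Johnson", "Cathy", "44000"},
            {"22", "Jones", "Bob", "55000"},
        };
        serializeByColumn(rows).forEach(System.out::println);

        // "Appending" one row (hypothetical data) changes the end of
        // every column run, so every column has to be written again.
        String[][] withExtraRow = Arrays.copyOf(rows, rows.length + 1);
        withExtraRow[rows.length] = new String[] {"15", "Lee", "Ann", "48000"};
        System.out.println("--- after adding one row ---");
        serializeByColumn(withExtraRow).forEach(System.out::println);
    }
}
```

In this model the new row's values cannot simply be written at the end of the output: each one belongs at the end of a different column run, which is why the whole serialized form is regenerated.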
    

    Some links

    https://en.wikipedia.org/wiki/Column-oriented_DBMS

    https://parquet.apache.org/
