Hive Create Multi small files for each insert in HDFS

轻奢々 2020-12-14 13:19

The following has already been achieved:

  1. Kafka Producer pulling data from Twitter using Spark Streaming.
  2. Kafka Consumer ingesting data into Hive External t
3 answers
  •  旧时难觅i
    2020-12-14 14:09

    You can combine these steps:

    1. Turn on ACID.
    2. Create an ORC table K with the transactional property.
    3. Insert into K many times, either by streaming or with plain INSERT DML.
    4. Hive will automatically create small delta files.
    5. Minor or major compactions will happen.
    6. The small files will be merged into larger files.
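
The steps above might look roughly like the HiveQL that the following sketch prints. It only builds statement strings and does not connect to Hive; the helper name `acid_statements`, the columns `id` and `body`, and the bucket count are assumptions made for illustration. Two caveats: transactional tables must be managed tables, so this does not apply directly to an external table, and automatic compaction also needs `hive.compactor.initiator.on=true` and `hive.compactor.worker.threads` set on the metastore side.

```python
# Sketch only: builds HiveQL strings for the steps in the answer above.
# It does not talk to Hive; run the printed statements with beeline or
# any Hive client. Table/column names and bucket count are hypothetical.


def acid_statements(table="k", buckets=4):
    """Return HiveQL statements for steps 1-6, in order."""
    return [
        # Step 1: turn on ACID for the session.
        "SET hive.support.concurrency=true",
        "SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
        # Step 2: ORC table with the transactional property.
        # CLUSTERED BY is required for ACID tables before Hive 3;
        # newer versions allow omitting it.
        f"CREATE TABLE {table} (id BIGINT, body STRING) "
        f"CLUSTERED BY (id) INTO {buckets} BUCKETS "
        "STORED AS ORC TBLPROPERTIES ('transactional'='true')",
        # Steps 3-4: each insert writes a small delta directory.
        f"INSERT INTO TABLE {table} VALUES (1, 'first tweet')",
        # Steps 5-6: request a major compaction by hand instead of
        # waiting for the automatic one, then check its progress.
        f"ALTER TABLE {table} COMPACT 'major'",
        "SHOW COMPACTIONS",
    ]


if __name__ == "__main__":
    for stmt in acid_statements():
        print(stmt + ";")
```

Requesting the compaction manually is optional; with the compactor enabled on the metastore, Hive schedules minor and major compactions on its own once enough delta files accumulate.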
