Avro vs. Parquet

自闭症患者 2020-12-07 09:39

I'm planning to use one of the Hadoop file formats for my Hadoop-related project. I understand Parquet is efficient for column-based queries and Avro for full...

7 answers
  •  轻奢々 (original poster)
    2020-12-07 09:59

    Silver Blaze described this nicely with an example use case and explained why Parquet was the best choice for him. It makes sense to choose one over the other depending on your requirements. I am adding a brief description of some other file formats as well, along with a time and space complexity comparison. Hope that helps.
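    As a rough illustration of why the requirements matter, here is a toy sketch in plain Python. It is not how Avro or Parquet actually store data (real files add schemas, blocks or row groups, encoding and compression); it only contrasts a row-oriented layout, which is the idea behind Avro, with a column-oriented layout, which is the idea behind Parquet. The records, field names and helper functions are all made up for the example.

```python
# Toy model only: real Avro and Parquet files add schemas, blocks/row groups,
# encoding and compression. All names and data here are hypothetical.

records = [
    {"id": 1, "name": "a", "amount": 10.0},
    {"id": 2, "name": "b", "amount": 20.0},
    {"id": 3, "name": "c", "amount": 30.0},
]

# Row-oriented layout (the Avro-style idea): each record's fields sit together.
row_store = [tuple(r.values()) for r in records]

# Column-oriented layout (the Parquet-style idea): each column's values sit together.
column_store = {key: [r[key] for r in records] for key in records[0]}


def sum_amount_from_rows(rows):
    """Sum one field; every full row is read to reach it."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole row is touched
        total += row[2]          # "amount" is the third field
    return total, values_read


def sum_amount_from_columns(columns):
    """Sum one field; only that column is read."""
    column = columns["amount"]
    return sum(column), len(column)


row_total, row_values_read = sum_amount_from_rows(row_store)
col_total, col_values_read = sum_amount_from_columns(column_store)

print(row_total, row_values_read)
print(col_total, col_values_read)
```

    Both layouts give the same total, but the column layout touches only the values of the one column being queried. Reading whole records goes the other way: the row layout hands back each record in one piece, while the column layout has to stitch it together from every column.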

    There are a number of file formats that you can use in Hive. Notable ones are Avro, Parquet, RCFile and ORC. There are some good documents available online that you can refer to if you want to compare the performance and space utilization of these file formats. Here are some useful links:

    This Blog Post

    This link from MapR (it does not discuss Parquet, though)

    This link from Inquidia

    The links above should get you started. I hope this answers your question.

    Thanks!
