Pig vs Hive vs Native Map Reduce


无人及你 2020-12-14 01:55

I have a basic understanding of what the Pig and Hive abstractions are, but I don't have a clear idea of the scenarios that call for Hive, Pig, or native MapReduce.

I went thr

7 answers

误落风尘 (original poster)

2020-12-14 02:43

Hive

Pros:

SQL-like, which database people love.
Good support for structured data.
Currently supports database schemas and view-like structures.
Supports concurrent multi-user, multi-session scenarios.
Bigger community support: Hive, HiveServer, HiveServer2, Impala, and Sentry already exist.

Cons:

Performance degrades as the data grows, with memory overflow issues, and there is not much you can do about it.
Hierarchical data is a challenge.
Unstructured data requires a UDF-like component.
Combining multiple techniques can be a nightmare, for example dynamic partitions with a UDTF on big data.
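To make the "UDF-like component" point concrete, here is a minimal sketch in Python of the kind of parsing script that unstructured data tends to need. It assumes an Apache-style access log as input and tab-separated rows as output, which is the shape a streaming script usually exchanges with the engine. The names `LOG_PATTERN`, `parse_line`, and `main` are made up for illustration and are not part of any Hive or Pig API.

```python
import re
import sys

# Assumed input format: an Apache-style access log line, e.g.
# 127.0.0.1 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def parse_line(line):
    """Turn one raw log line into a tab-separated row, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = ("ip", "ts", "method", "path", "status")
    return "\t".join(match.group(name) for name in fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read raw lines, write structured rows, and silently drop unparseable lines."""
    for line in stdin:
        row = parse_line(line.rstrip("\n"))
        if row is not None:
            stdout.write(row + "\n")
```

A real script would call `main()` at the bottom. In Hive, a script like this could plausibly be plugged in through a `SELECT TRANSFORM(...) USING 'python parse_log.py' AS (...)` query after an `ADD FILE`; the file name and column list there are hypothetical.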

Pig

Pros:

Great script-based data flow language.

Cons:

Unstructured data requires a UDF-like component.
Not much community support.

MapReduce

Pros:

I don't agree that join functionality is "hard to achieve": if you understand what kind of join you want, you can implement it in a few lines of code.
Most of the time, MR yields better performance.
MR support for hierarchical data is great, especially for implementing tree-like structures.
Better control over partitioning and indexing the data.
Job chaining.

Cons:

You need to know the API very well to get better performance.
More code to write, debug, and maintain.
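As a rough illustration of the join and partitioning points in the MapReduce pros, here is a sketch of a reduce-side inner join simulated in plain Python. It is not Hadoop code: the map, shuffle, and reduce phases are ordinary functions, and the record layouts (`users` as `(user_id, name)`, `orders` as `(order_id, user_id, amount)`) and all function names are assumptions made up for the example.

```python
from collections import defaultdict
from zlib import crc32


def map_users(record):
    """Emit (join key, tagged value) for a hypothetical (user_id, name) record."""
    user_id, name = record
    yield user_id, ("U", name)


def map_orders(record):
    """Emit (join key, tagged value) for a hypothetical (order_id, user_id, amount) record."""
    order_id, user_id, amount = record
    yield user_id, ("O", (order_id, amount))


def partition(key, num_reducers):
    """Custom partitioner: a stable hash decides which reducer sees a key."""
    return crc32(str(key).encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers):
    """Group values by key inside each reducer's partition."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[partition(key, num_reducers)][key].append(value)
    return partitions


def reduce_join(key, values):
    """Inner join: pair every user value with every order value for the same key."""
    names = [value for tag, value in values if tag == "U"]
    orders = [value for tag, value in values if tag == "O"]
    for name in names:
        for order_id, amount in orders:
            yield key, name, order_id, amount


def run_join(users, orders, num_reducers=2):
    """Run map, shuffle, and reduce in sequence and return the sorted joined rows."""
    pairs = []
    for user in users:
        pairs.extend(map_users(user))
    for order in orders:
        pairs.extend(map_orders(order))
    results = []
    for part in shuffle(pairs, num_reducers):
        for key in sorted(part):
            results.extend(reduce_join(key, part[key]))
    return sorted(results)
```

Calling `run_join([(1, "ann")], [(100, 1, 9.5)])` yields one joined row per matching user and order. The join logic itself is short once the join type is decided; the cost, as the cons note, is that tagging, partitioning, and grouping are all yours to write and maintain.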
