MapReduce alternatives

你离开我真会死。 提交于 2019-12-20 08:41:19

问题


Are there any alternative paradigms to MapReduce (Google, Hadoop)? Is there any other reasonable way how to split & merge big problems?


回答1:


Definitively. Check out, for example, Bulk Synchronous Parallel. Map/Reduce is in fact a very restricted way of reducing problems, however that restriction makes it manageable in a framework like Hadoop. The question is if it is less trouble to press your problem into a Map/Reduce setting, or if its easier to create a domain-specific parallelization scheme and having to take care of all the implementation details yourself. Pig, in fact, is only an abstraction layer on top of Hadoop which automates many standard problem transformations from not-Map-Reduce-y to Map-Reduce-compatible.

Edit 26.1.13: Found a nice up-to-date overview here




回答2:


Phil Colella identified seven numerical methods for scientific computation based on the patterns of scattering and gathering of data between processing nodes, and called them 'dwarfs'. These have been added to by others, a list is available at the Dwarf Mine:

  1. Dense Linear Algebra
  2. Sparse Linear Algebra
  3. Spectral Methods
  4. N-Body Methods
  5. Structured Grids
  6. Unstructured Grids
  7. MapReduce
  8. Combinational Logic
  9. Graph Traversal
  10. Dynamic Programming
  11. Backtrack and Branch-and-Bound
  12. Graphical Models
  13. Finite State Machines



回答3:


Update (August 2014): Stratosphere is now called Apache Flink (incubating).

Have a look at Stratosphere. It is another Big Data runtime that offers more operators (map, reduce, join, union, cross, iterate, ...). It also allows to define advanced data flow graphs (with Hadoop MR, you would have to chain jobs).

Stratosphere also supports BSP with its graph processing abstraction (called Spargel).

If you like to read scientific papers, have a look at Nephele/PACTs: A Programming Model and Execution Framework for Web-Scale Analytical Processing, it explains the theoretical backgrounds of the system.

Another system in the field is Spark which has its own model (RDDs). Since BSP has been mentioned here, also have a look at GraphLab, the offer an alternative to BSP.




回答4:


Microsoft's Dryad is claimed to be more general than MapReduce.




回答5:


Best alternate for MapReduce is Spark, because its 10 to 100 times faster than the MapReduce. And also very easy to maintain, less coding high performance.



来源:https://stackoverflow.com/questions/8692806/mapreduce-alternatives

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!