Data sharing in Hadoop Map Reduce chaining

天命终不由人 2021-01-06 20:28

Is it possible to share a value between a reducer and the mapper that follows it in a chain?

Or is it possible to store the output of the first reducer in memory so that the second mapper can access it?

2 answers

独厮守ぢ (original poster)

2021-01-06 21:04

Jobs are independent of one another, so data cannot be shared across jobs without storing the first job's output in an intermediate location.
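To make the hand-off concrete, here is a minimal sketch in plain Python. It only simulates the pattern and does not use the Hadoop API: two jobs run one after the other, and the only link between them is a file that the first job writes and the second job reads. The helper `run_job`, the two example jobs (a word count followed by a grouping of words by count), and all file names are made up for illustration.

```python
import json
import os
import tempfile
from collections import defaultdict


def run_job(input_path, output_path, mapper, reducer):
    """Simulate one MapReduce job: read lines, map, group by key, reduce, write."""
    groups = defaultdict(list)
    with open(input_path, encoding="utf-8") as src:
        for line in src:
            for key, value in mapper(line.rstrip("\n")):
                groups[key].append(value)
    with open(output_path, "w", encoding="utf-8") as dst:
        for key in sorted(groups):
            for out_key, out_value in reducer(key, groups[key]):
                dst.write(json.dumps([out_key, out_value]) + "\n")


# Job 1 (hypothetical example): word count.
def count_mapper(line):
    for word in line.split():
        yield word, 1


def count_reducer(word, ones):
    yield word, sum(ones)


# Job 2 (hypothetical example): group words by count.
# Its mapper sees only the records that job 1 wrote to the intermediate file.
def invert_mapper(line):
    word, count = json.loads(line)
    yield count, word


def invert_reducer(count, words):
    yield count, sorted(words)


def run_chain(text):
    """Run job 1, then job 2, passing data only through an intermediate file."""
    with tempfile.TemporaryDirectory() as workdir:
        raw = os.path.join(workdir, "input.txt")
        intermediate = os.path.join(workdir, "job1-output.txt")
        final = os.path.join(workdir, "job2-output.txt")
        with open(raw, "w", encoding="utf-8") as f:
            f.write(text)
        run_job(raw, intermediate, count_mapper, count_reducer)
        run_job(intermediate, final, invert_mapper, invert_reducer)
        with open(final, encoding="utf-8") as f:
            return [json.loads(line) for line in f]
```

Nothing from the first job's memory survives into the second job here. On a real cluster the intermediate location would typically be a directory that the first job writes as its output and the second job reads as its input.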

FYI, in the MapReduce model the map tasks don't talk to each other, and the same holds for the reduce tasks. Apache Giraph, which runs on Hadoop, does use communication between the mappers within a single job for iterative algorithms; without that communication, the same job would have to be run again and again.

I'm not sure which algorithm is being implemented or why MR was chosen, but every MR algorithm can also be implemented in BSP (Bulk Synchronous Parallel). Here is a paper comparing BSP with MR. Some algorithms perform better in BSP than in MR. Apache Hama is an implementation of the BSP model, just as Apache Hadoop is an implementation of MR.
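For a feel of how BSP differs, here is a rough sketch in plain Python. It is not the Giraph or Hama API. It propagates the maximum value through a small graph in supersteps: in each superstep every vertex reads the messages sent to it in the previous superstep, updates its value, and sends messages only if the value changed. The function name `bsp_max` and the example graph are assumptions made for illustration.

```python
def bsp_max(values, neighbors):
    """Spread the maximum value to every vertex using BSP-style supersteps.

    values:    dict mapping vertex -> initial number
    neighbors: dict mapping vertex -> list of adjacent vertices
    Returns (final values, number of supersteps executed).
    """
    current = dict(values)

    # Superstep 0: every vertex sends its own value to its neighbors.
    inbox = {v: [] for v in current}
    for v, val in current.items():
        for n in neighbors[v]:
            inbox[n].append(val)
    supersteps = 1

    # Later supersteps: a vertex only speaks again if its value grew.
    while any(inbox.values()):
        outbox = {v: [] for v in current}
        for v, messages in inbox.items():
            if not messages:
                continue
            best = max(messages)
            if best > current[v]:
                current[v] = best
                for n in neighbors[v]:
                    outbox[n].append(best)
        # Barrier: messages sent in this superstep become visible in the next.
        inbox = outbox
        supersteps += 1

    return current, supersteps


# Example: a four-vertex chain a - b - c - d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
final, steps = bsp_max({"a": 3, "b": 6, "c": 2, "d": 1}, graph)
```

The whole iteration happens inside one run, with state kept in memory between supersteps. Doing the same with plain MR would mean launching one job per iteration and writing the vertex values to intermediate storage each time.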
