Simple explanation of MapReduce?

佛祖请我去吃肉 2020-12-04 04:57

Related to my CouchDB question.

Can anyone explain MapReduce in terms a numbnuts could understand?

8 answers
  •  庸人自扰
    2020-12-04 05:00

    MapReduce is a method for processing vast amounts of data in parallel without requiring the developer to write any code other than the map and reduce functions.

    The map function takes data in and churns out a result, which is held in a barrier. Many instances of the same map task can run in parallel. Once they have finished, the reduce function combines the held results, for example reducing the dataset to a scalar value.
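
    As a rough, single-machine sketch of that flow (not how any real MapReduce framework is implemented), the snippet below runs the same map task over several chunks in a thread pool, waits for all of them, and then reduces the partial results to one number. The names `map_chunk` and `run` and the sample chunks are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk):
    """Map task: turn one chunk of input into a partial result."""
    return sum(len(word) for word in chunk)


def run(chunks):
    # Many copies of the same map task run in parallel, one per chunk.
    with ThreadPoolExecutor() as pool:
        # list(...) waits for every map task to finish: this plays the
        # role of the barrier that holds the map output.
        partials = list(pool.map(map_chunk, chunks))
    # Reduce combines the partial results into a single scalar.
    return reduce(lambda a, b: a + b, partials, 0)


# Hypothetical input, already split into chunks.
chunks = [["map", "reduce"], ["barrier"], ["parallel", "task"]]
total_chars = run(chunks)
```

    Here the thread pool stands in for a cluster of workers; the point is only that the developer writes `map_chunk` and the reducing function, and the surrounding machinery handles the parallelism.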

    So if you think of it like a SQL statement:

    SELECT deptname, SUM(salary)
    FROM employees
    WHERE salary > 1000
    GROUP BY deptname
    

    We can use map to select the subset of employees with salary > 1000. Map emits each of those rows to the barrier, where they are collected into buckets, one per deptname.

    Reduce will then sum each of those groups, giving you your result set.
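
    The SQL analogy can be sketched in plain Python, assuming a small in-memory list of employee rows. The sample data and the names `map_employee`, `shuffle` and `reduce_salaries` are invented for this illustration and do not come from the Google paper or from any particular framework.

```python
from collections import defaultdict

# Hypothetical sample rows: (name, deptname, salary)
employees = [
    ("ann", "eng", 1500),
    ("bob", "eng", 900),
    ("cat", "ops", 1200),
    ("dan", "ops", 2000),
    ("eve", "hr", 800),
]


def map_employee(row):
    """Map: emit (deptname, salary) for rows passing the WHERE clause."""
    name, deptname, salary = row
    if salary > 1000:
        yield deptname, salary


def shuffle(pairs):
    """Barrier: collect emitted values into one bucket per key (GROUP BY)."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[key].append(value)
    return buckets


def reduce_salaries(deptname, salaries):
    """Reduce: collapse one bucket into a single value (SUM)."""
    return deptname, sum(salaries)


emitted = (pair for row in employees for pair in map_employee(row))
result = dict(reduce_salaries(k, v) for k, v in shuffle(emitted).items())
```

    With this sample data, `result` holds one total per department that has at least one salary above 1000; a department with no qualifying rows never reaches reduce at all, just as it would be absent from the SQL result.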

    I just plucked this from my university study notes on the Google paper.
