Apache Beam : FlatMap vs Map?

Submitted by ↘锁芯ラ on 2019-11-29 00:06:47

Question


I want to understand in which scenarios I should use the FlatMap transform and in which I should use Map. The documentation was not clear to me.

Could someone give me an example so I can understand their difference?

I understand the difference between FlatMap and Map in Spark, but I am not sure whether the Beam transforms are similar.


Answer 1:


These transforms in Beam are exactly the same as in Spark (and Scala).

A Map transform maps a PCollection of N elements into another PCollection of N elements.

A FlatMap transform maps a PCollection of N elements into N collections of zero or more elements each, which are then flattened into a single PCollection.
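The two definitions above can be modeled in plain Python. This is only a sketch of the semantics, not Beam code: the helper names map_model and flat_map_model are hypothetical, and ordinary lists stand in for PCollections (which, unlike lists, are unordered).

```python
def map_model(fn, items):
    # One output element per input element: N in, N out.
    return [fn(x) for x in items]


def flat_map_model(fn, items):
    # fn returns an iterable per input element; all of them are
    # concatenated, so each input contributes zero or more outputs.
    return [y for x in items for y in fn(x)]


def repeat(x):
    # Returns a list holding x copies of x (empty for 0).
    return [x] * x


print(map_model(repeat, [0, 1, 2, 3]))       # [[], [1], [2, 2], [3, 3, 3]]
print(flat_map_model(repeat, [0, 1, 2, 3]))  # [1, 2, 2, 3, 3, 3]
```

Under this model, Map always keeps the element count, while FlatMap can shrink it (the input 0 produces nothing) or grow it (the input 3 produces three elements).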

As a simple example, the following happens:

beam.Create([1, 2, 3]) | beam.Map(lambda x: [x, 'any'])
# The result is a collection of THREE lists: [[1, 'any'], [2, 'any'], [3, 'any']]

Whereas:

beam.Create([1, 2, 3]) | beam.FlatMap(lambda x: [x, 'any'])
# The lists output by the lambda are then flattened into a
# collection of SIX single elements: [1, 'any', 2, 'any', 3, 'any']


Source: https://stackoverflow.com/questions/45670930/apache-beam-flatmap-vs-map
