Apache Beam: FlatMap vs Map?

渐次进展 2020-12-14 07:35

I want to understand in which scenarios I should use FlatMap and in which I should use Map. The documentation did not seem clear to me.


3 answers
  •  情书的邮戳
    2020-12-14 08:09

    These transforms in Beam are exactly the same as in Spark (and Scala too).

    A Map transform maps a PCollection of N elements into another PCollection of N elements.

    A FlatMap transform maps a PCollection of N elements into N collections of zero or more elements, which are then flattened into a single PCollection.

    As a simple example, the following happens:

    import apache_beam as beam
    with beam.Pipeline() as p:
        p | beam.Create([1, 2, 3]) | beam.Map(lambda x: [x, 'any'])
    # The result is a collection of THREE lists: [[1, 'any'], [2, 'any'], [3, 'any']]
    

    Whereas:

    import apache_beam as beam
    with beam.Pipeline() as p:
        p | beam.Create([1, 2, 3]) | beam.FlatMap(lambda x: [x, 'any'])
    # The lists that are output by the lambda are then flattened into a
    # collection of SIX single elements: [1, 'any', 2, 'any', 3, 'any']
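
To see the difference without installing Beam, here is a rough plain-Python analogy of the two behaviours described above. It is only a sketch: `map_elements` and `flat_map_elements` are hypothetical helper names, not Beam APIs, and ordinary lists stand in for PCollections (which, unlike lists, are unordered and may be distributed).

```python
from itertools import chain


def map_elements(fn, elements):
    """Analogy for Map: exactly one output element per input element."""
    return [fn(x) for x in elements]


def flat_map_elements(fn, elements):
    """Analogy for FlatMap: fn returns an iterable of zero or more outputs
    per input, and all of those iterables are flattened into one list."""
    return list(chain.from_iterable(fn(x) for x in elements))


def pair_with_any(x):
    # Same function as the lambda in the examples above.
    return [x, 'any']


mapped = map_elements(pair_with_any, [1, 2, 3])
flattened = flat_map_elements(pair_with_any, [1, 2, 3])

# "Zero or more": returning an empty list drops the input entirely,
# so odd numbers produce no output and even numbers produce two.
evens_twice = flat_map_elements(
    lambda x: [x, x] if x % 2 == 0 else [], [1, 2, 3]
)

print(mapped)       # [[1, 'any'], [2, 'any'], [3, 'any']]
print(flattened)    # [1, 'any', 2, 'any', 3, 'any']
print(evens_twice)  # [2, 2]
```

As a rule of thumb following from this: use Map when each input yields exactly one output, and FlatMap when the function returns an iterable whose items should each become separate elements, including the case where some inputs yield nothing.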
    
