Google Cloud Dataflow: Read from a file with a dynamic filename

Submitted by 我怕爱的太早我们不能终老 on 2019-12-07 12:02:31

The natural way to express this read is the TextIO.readAll() method, which reads text files from an input PCollection of file names. This method has been added to the Beam codebase but is not yet in a released version; it will be included in the Beam 2.2.0 release and the corresponding Dataflow 2.2.0 release.

You can also do this with a SerializableFunction:

pipeline.apply(TextIO.read().from(new FileNameFn()));

// Input: the file name; output: the fully qualified file name, including the bucket.
public class FileNameFn implements SerializableFunction<String, String>

Obviously, you can pass the bucket name and other parameters statically through constructor arguments when creating an instance of this class.
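
As a rough illustration of the constructor-argument idea, here is a minimal sketch of what the body of such a file-name function could look like. To keep it runnable without the Beam SDK on the classpath, it implements java.util.function.Function plus Serializable as a stand-in for Beam's SerializableFunction<String, String>, which declares the same apply method. The bucket name "my-bucket", the "input" folder, and the two-argument constructor are assumptions made for this example and are not part of the original answer. How the function is wired into the read depends on the SDK version and is not shown.

```java
import java.io.Serializable;
import java.util.function.Function;

// Hypothetical stand-in for a Beam SerializableFunction<String, String>:
// maps a bare file name to a fully qualified Cloud Storage path.
class FileNameFn implements Function<String, String>, Serializable {
    private static final long serialVersionUID = 1L;

    private final String bucket; // assumed bucket name, e.g. "my-bucket"
    private final String folder; // assumed folder inside the bucket, e.g. "input"

    // The static parts of the path are fixed when the instance is created.
    FileNameFn(String bucket, String folder) {
        this.bucket = bucket;
        this.folder = folder;
    }

    @Override
    public String apply(String fileName) {
        // Build gs://<bucket>/<folder>/<fileName> from the dynamic file name.
        return "gs://" + bucket + "/" + folder + "/" + fileName;
    }

    public static void main(String[] args) {
        FileNameFn fn = new FileNameFn("my-bucket", "input");
        System.out.println(fn.apply("2019-12-07.txt"));
    }
}
```

Keeping the bucket and folder as final fields means the instance carries everything it needs when it is serialized and shipped to workers; only the file name varies per call.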

Hope this helps.
