Dataflow GroupBy -> multiple outputs based on keys

Submitted by 本小妞迷上赌 on 2020-01-07 05:02:13

Question


Is there a simple way to redirect the output of GroupBy into multiple output files based on the group keys?

Bin.apply(GroupByKey.<String, KV<Long,Iterable<TableRow>>>create())
.apply(ParDo.named("Print Bins").of( ... ))
.apply(TextIO.Write.to(*Output file based on key*))

If a Sink is the solution, could you please share some sample code with me?

Thanks!


Answer 1:


Beam 2.2 will include an API to do just that: TextIO.write().to(DynamicDestinations), see source. Until 2.2 is released, you can use this API through the 2.2.0-SNAPSHOT version. Note that the API is experimental and might change in Beam 2.3 or later.
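The idea behind dynamic destinations is that each group key is mapped to its own output file. The sketch below illustrates only that routing step in plain Java, with no Beam classes involved, so it is not the Beam API itself. The class name KeyedOutputSketch, the helper methods, and the "bin-<key>.txt" naming rule are all hypothetical and chosen for illustration. In a real pipeline the equivalent key-to-file mapping would be supplied through the DynamicDestinations object mentioned above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only: no Beam classes are used here.
final class KeyedOutputSketch {

    // Hypothetical naming rule: one text file per group key.
    // Assumes keys are safe to use in file names (no path separators).
    static String fileNameFor(String key) {
        return "bin-" + key + ".txt";
    }

    // Local stand-in for a group-by-key step: collects values per key,
    // keeping keys in the order they were first seen.
    static Map<String, List<String>> groupByKey(List<String[]> keyedLines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String[] kv : keyedLines) {
            groups.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        return groups;
    }

    // Writes each group to its own file under outputDir and returns the paths written.
    static List<Path> writePerKey(Map<String, List<String>> groups, Path outputDir)
            throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : groups.entrySet()) {
            Path target = outputDir.resolve(fileNameFor(entry.getKey()));
            Files.write(target, entry.getValue(), StandardCharsets.UTF_8);
            written.add(target);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        List<String[]> input = new ArrayList<>();
        input.add(new String[] {"a", "row 1"});
        input.add(new String[] {"b", "row 2"});
        input.add(new String[] {"a", "row 3"});

        Path dir = Files.createTempDirectory("keyed-output");
        for (Path p : writePerKey(groupByKey(input), dir)) {
            System.out.println(p + " -> " + Files.readAllLines(p, StandardCharsets.UTF_8));
        }
    }
}
```

Keeping the naming rule in a single function (fileNameFor here) means the grouping and writing code stays the same if the file layout changes.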



Source: https://stackoverflow.com/questions/46715860/dataflow-groupby-multiple-outputs-based-on-keys
