Question
Is there a simple way to write the output of a GroupByKey to multiple output files, chosen by group key?
Bin.apply(GroupByKey.<String, KV<Long, Iterable<TableRow>>>create())
   .apply(ParDo.named("Print Bins").of( ... ))
   .apply(TextIO.Write.to(*Output file based on key*));
If a Sink is the solution, could you please share some sample code?
Thanks!
Answer 1:
Beam 2.2 will include an API that does exactly this: TextIO.write().to(DynamicDestinations) (see source). Until 2.2 is released, you can use this API through the 2.2.0-SNAPSHOT version. Note that the API is experimental and may change in Beam 2.3 or later.
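To make the intended behavior concrete, here is a minimal plain-Java sketch of what key-based destinations amount to: records are grouped by key, and each group goes to its own file named after that key. This is not the Beam API and not the DynamicDestinations implementation. The class KeyedFileWriter, its methods, and the "<key>.txt" naming scheme are hypothetical names chosen for illustration, and the sketch assumes keys are safe to use as file names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Beam): routes each record to a file named after its key.
final class KeyedFileWriter {

    private KeyedFileWriter() {}

    // Groups (key, line) pairs by key, preserving input order within each key.
    static Map<String, List<String>> groupByKey(List<Map.Entry<String, String>> records) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, String> record : records) {
            groups.computeIfAbsent(record.getKey(), k -> new ArrayList<>())
                  .add(record.getValue());
        }
        return groups;
    }

    // Writes one file per key at "<outputDir>/<key>.txt" and returns the paths written.
    // Assumption: keys contain no path separators or other unsafe characters.
    static List<Path> writePerKey(Map<String, List<String>> groups, Path outputDir)
            throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            Path file = outputDir.resolve(group.getKey() + ".txt");
            Files.write(file, group.getValue(), StandardCharsets.UTF_8);
            written.add(file);
        }
        return written;
    }
}
```

Calling KeyedFileWriter.writePerKey(KeyedFileWriter.groupByKey(records), outputDir) produces one file per distinct key. In a real pipeline the runner handles sharding, windowing, and retries, which this single-process sketch deliberately ignores.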
Source: https://stackoverflow.com/questions/46715860/dataflow-groupby-multiple-outputs-based-on-keys