Question
Is there a way to create a dynamic DataSink output path in Flink?
The DataSet has element type Tuple2<String, String>.
When we used the streaming API, I could generate a dynamic path with a custom Bucketer like the one below:
@Override
public Path getBucketPath(Clock clock, Path basePath, Tuple2<String, String> element) {
    return new Path(basePath + "/schema=" + element.f0.toLowerCase().trim() + "/");
}
I would like to know whether there is a similar way to generate a custom output path for a DataSet.
Answer 1:
I poked around a bit and didn't find anything similar for batch processing, so I think you'd have to create your own OutputFormat class that wraps a regular FileOutputFormat and does the bucketing, using the same Bucketer interface.
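To make the idea concrete, here is a minimal sketch of the bucketing logic such a wrapper would need. It is plain Java with no Flink dependency, so it is not a drop-in OutputFormat. The names BucketingWriter and bucketFor, and the part-<taskNumber> file name, are made up for illustration. In Flink, the same logic would presumably sit behind the OutputFormat methods open(taskNumber, numTasks), writeRecord(record) and close(), with the inner writer replaced by a FileOutputFormat per bucket.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not a Flink class): routes each (key, value) record
 * to a file under basePath/schema=<key>/, opening one writer per bucket.
 */
class BucketingWriter implements AutoCloseable {
    private final Path basePath;
    private final int taskNumber;
    // One lazily opened writer per bucket directory name.
    private final Map<String, BufferedWriter> writers = new HashMap<>();

    BucketingWriter(Path basePath, int taskNumber) {
        this.basePath = basePath;
        this.taskNumber = taskNumber;
    }

    /** Same naming rule as the Bucketer in the question. */
    static String bucketFor(String key) {
        return "schema=" + key.toLowerCase().trim();
    }

    /** Appends the value as one line to the file of the bucket derived from the key. */
    void writeRecord(String key, String value) throws IOException {
        String bucket = bucketFor(key);
        BufferedWriter writer = writers.get(bucket);
        if (writer == null) {
            Path dir = basePath.resolve(bucket);
            Files.createDirectories(dir);
            // One file per parallel task, so tasks never write to the same file.
            writer = Files.newBufferedWriter(
                    dir.resolve("part-" + taskNumber), StandardCharsets.UTF_8);
            writers.put(bucket, writer);
        }
        writer.write(value);
        writer.newLine();
    }

    /** Flushes and closes every bucket writer that was opened. */
    @Override
    public void close() throws IOException {
        for (BufferedWriter writer : writers.values()) {
            writer.close();
        }
        writers.clear();
    }
}
```

Keeping one open writer per bucket is simple, but it holds one file handle per distinct key, so with many distinct schema values you would probably want to cap or evict open writers.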
Source: https://stackoverflow.com/questions/54690095/how-to-generate-dynamic-path-in-dataset-during-the-output-method