How to aggregate over the full data set in Spring Batch jobs?

Submitted by 扶醉桌前 on 2019-12-11 13:05:24

Question


I need to add aggregation to my Spring Batch jobs, but the aggregation step needs to have the entire data set available.

In pure SQL, aggregation queries are easy to write: the full data set (as stored in the database) is available.

But in Spring Batch jobs, everything is done in memory and split into chunks. So how do I deal with data being scattered like that?

Do you have any advice on best practices for adding aggregation steps or processes?

Thanks a lot for your insights.


Answer 1:


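Spring Batch has a partitioning option, which can use a StepExecutionAggregator. Its aggregate method receives the master step's StepExecution together with the collection of StepExecutions of all partitioned steps, so the execution context of every child step is available in one place.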

For example, we had an integration with a SOAP server where we first received a list of things that needed to be processed, then partitioned it into child steps and processed them in parallel. Once the child steps have finished, the aggregator is invoked and can act on the data in the child step contexts.

This is a good approach if your data contains something that makes a good partitioning rule (e.g. pull a list of items from the DB and process each item in parallel, save the item data in the step context, then use the aggregator to combine everything from each step context and perform a common operation on the combined data).
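The flow described above can be sketched in plain Java. This is only an illustration of the partition-then-aggregate idea, not Spring Batch code: the class name, the method names, and the use of a `Map` as a stand-in for a step context are all assumptions made for the example. In a real job, the combining logic would presumably live in a `StepExecutionAggregator` implementation, whose `aggregate(StepExecution result, Collection<StepExecution> executions)` method sees every child step execution.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-JDK sketch of the partition-then-aggregate pattern.
// All names are hypothetical; no Spring Batch types are used here.
class PartitionAggregationSketch {

    // Splits the full data set into at most `partitions` contiguous slices
    // (roughly the role a Partitioner plays in Spring Batch).
    static List<List<Integer>> partition(List<Integer> items, int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        List<List<Integer>> slices = new ArrayList<>();
        int size = (items.size() + partitions - 1) / partitions; // ceiling division
        for (int start = 0; start < items.size(); start += size) {
            int end = Math.min(start + size, items.size());
            slices.add(new ArrayList<>(items.subList(start, end)));
        }
        return slices;
    }

    // Work done by one child step: it only sees its own slice and stores
    // partial results in its own context (a Map stands in for the step context).
    static Map<String, Long> processSlice(List<Integer> slice) {
        long sum = 0;
        for (int value : slice) {
            sum += value;
        }
        Map<String, Long> context = new HashMap<>();
        context.put("sum", sum);
        context.put("count", (long) slice.size());
        return context;
    }

    // Combines the partial results of all child contexts into totals
    // that describe the full data set.
    static Map<String, Long> aggregate(List<Map<String, Long>> childContexts) {
        long sum = 0;
        long count = 0;
        for (Map<String, Long> context : childContexts) {
            sum += context.get("sum");
            count += context.get("count");
        }
        Map<String, Long> result = new HashMap<>();
        result.put("sum", sum);
        result.put("count", count);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(4, 8, 15, 16, 23, 42);
        List<Map<String, Long>> contexts = new ArrayList<>();
        for (List<Integer> slice : partition(items, 3)) {
            contexts.add(processSlice(slice)); // in a real job these could run in parallel
        }
        Map<String, Long> total = aggregate(contexts);
        long sum = total.get("sum");
        long count = total.get("count");
        // prints "sum=108 count=6 avg=18.0"
        System.out.println("sum=" + sum + " count=" + count + " avg=" + ((double) sum / count));
    }
}
```

Each child keeps a sum and a count rather than an average, because partial averages cannot simply be averaged again when slices differ in size. Carrying combinable partial results is what lets the final step compute a value over the full data set.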

Here is a link to an example with partitioning (it has no aggregation, but you can add it to masterStep).



Source: https://stackoverflow.com/questions/29486995/howto-aggregate-on-full-data-set-in-spring-batch-jobs
