Grid Size in Spring Batch

Submitted by 此生再无相见时 on 2019-12-18 04:18:28

Question


I have a batch job that reads data from bulk files, processes it, and inserts it into a DB.

I'm using Spring's partitioning features with the default partition handler.

    <bean class="org.spr...TaskExecutorPartitionHandler">
          <property name="taskExecutor" ref="taskExecutor"/>
          <property name="step" ref="readFromFile" />
          <property name="gridSize" value="10" />
    </bean>

What is the significance of gridSize here? I have configured it to be equal to the concurrency of the taskExecutor.


Answer 1:


gridSize specifies the number of data blocks to create, which are then processed by (usually) the same number of workers. Think of it as the number of mapped data blocks in a map/reduce job.

Using a StepExecutionSplitter, the PartitionHandler "partitions" (splits) the given data into gridSize parts and sends each part to an independent worker, which in your case is a thread.

For example, say you have 10 rows in the DB that need to be processed. If you set gridSize to 5 and use straightforward partitioning logic, you end up with 10 / 5 = 2 rows per thread, that is, 5 threads working concurrently on 2 rows each.
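To make the arithmetic concrete, here is a minimal plain-Java sketch of that kind of "straightforward" partitioning. It does not use Spring Batch at all: the class `GridSizeSketch` and its `partition` and `process` helpers are made up for illustration, and a fixed thread pool stands in for the taskExecutor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class GridSizeSketch {

    // Splits row indices 0..rowCount-1 into at most gridSize contiguous
    // {start, end} ranges (both inclusive). Hypothetical helper, not Spring API.
    static List<int[]> partition(int rowCount, int gridSize) {
        if (gridSize <= 0) {
            throw new IllegalArgumentException("gridSize must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        int base = rowCount / gridSize;   // rows every partition gets
        int extra = rowCount % gridSize;  // leftover rows, one each for the first partitions
        int start = 0;
        for (int i = 0; i < gridSize && start < rowCount; i++) {
            int size = base + (i < extra ? 1 : 0);
            ranges.add(new int[] {start, start + size - 1});
            start += size;
        }
        return ranges;
    }

    // Submits one task per range to a pool of poolSize threads and returns
    // the number of rows each task handled. The "work" here is just counting.
    static List<Integer> process(List<int[]> ranges, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int[] range : ranges) {
                futures.add(pool.submit(() -> range[1] - range[0] + 1));
            }
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> future : futures) {
                counts.add(future.get());
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<int[]> ranges = partition(10, 5);
        System.out.println(process(ranges, 5)); // prints [2, 2, 2, 2, 2]
    }
}
```

In this sketch, if the pool has fewer threads than there are partitions, the extra tasks simply wait in the pool's queue, so every partition is still processed, just not all at once.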




Answer 2:


Per the API,

Passed to the StepExecutionSplitter in the handle(StepExecutionSplitter, StepExecution) method, instructing it how many StepExecution instances are required, ideally. The StepExecutionSplitter is allowed to ignore the grid size in the case of a restart, since the input data partitions must be preserved.
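A toy model of the "allowed to ignore the grid size on restart" part may help. This is only a sketch: the `Splitter` class below is a hypothetical stand-in, not Spring Batch's StepExecutionSplitter, and the partition names are invented. It just shows the idea that gridSize is a hint for the first run, while a restart reuses the partitions that were already created.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitterHintSketch {

    // Hypothetical stand-in for a splitter; NOT the Spring Batch implementation.
    static class Splitter {
        // Partition names remembered from a previous run (null on the first run).
        private List<String> saved = null;

        List<String> split(int gridSize) {
            if (saved != null) {
                // Restart: keep the original partitions so each one resumes
                // its own slice of the input, whatever gridSize says now.
                return new ArrayList<>(saved);
            }
            List<String> names = new ArrayList<>();
            for (int i = 0; i < gridSize; i++) {
                names.add("partition" + i);
            }
            saved = names;
            return new ArrayList<>(names);
        }
    }

    public static void main(String[] args) {
        Splitter splitter = new Splitter();
        System.out.println(splitter.split(4).size()); // first run: prints 4
        System.out.println(splitter.split(8).size()); // "restart" with gridSize 8: prints 4
    }
}
```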




Answer 3:


Grid size is simply the set of tasks (think of it as a sack of bags) that a single partitioned step picks up for processing. Once it is done with all the tasks it took (its sack of bags), it comes back for the next set of tasks (the next sack of bags).



Source: https://stackoverflow.com/questions/7759156/grid-size-in-spring-batch
