Spring Batch: correctly restarting uncompleted jobs in a clustered environment

Submitted by 偶尔善良 on 2019-12-01 19:45:42

Your logic is not restarting uncompleted jobs. It takes the currently running job executions, sets their status to FAILED and restarts them. Instead of looking for running executions, it should look for executions that are not currently running, especially failed ones, and restart those.

How to correctly restart the failed jobs and prevent jobs like jobInstance2 from also being restarted?

In pseudo code, what you need to do to achieve this is:

  1. Get the job instances of your job with JobOperator#getJobInstances.
  2. For each instance, check if there is a running execution using JobOperator#getExecutions.

    2.1 If there is a running execution, move to the next instance (in order to let the execution finish either successfully or with a failure).

    2.2 If there is no currently running execution, check the status of the last execution and, if it failed, restart it using JobOperator#restart.
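
The decision in steps 2.1 and 2.2 can be sketched in plain Java as below. This is only a model of the filtering logic, not Spring Batch code: RestartPlanner, Status and Execution are hypothetical names made up for the illustration, the status names merely mirror Spring Batch's BatchStatus, and two assumptions are baked in (STARTING, STARTED and STOPPING count as "running", and the "last" execution of an instance is the one with the highest id).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java model of the restart decision; no Spring Batch types are used.
final class RestartPlanner {

    // Status names mirror Spring Batch's BatchStatus, but this enum is local to the sketch.
    public enum Status { STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED }

    // Hypothetical view of one job execution: its id and its current status.
    public static final class Execution {
        final long id;
        final Status status;

        public Execution(long id, Status status) {
            this.id = id;
            this.status = status;
        }

        // Assumption of this sketch: these three statuses mean "still running".
        boolean isRunning() {
            return status == Status.STARTING
                    || status == Status.STARTED
                    || status == Status.STOPPING;
        }
    }

    // Input: job instance id -> all executions of that instance.
    // Output: ids of the executions that should be restarted.
    public static List<Long> executionsToRestart(Map<Long, List<Execution>> executionsByInstance) {
        List<Long> toRestart = new ArrayList<>();
        for (List<Execution> executions : executionsByInstance.values()) {
            // Step 2.1: an execution is still running, so leave this instance alone.
            if (executions.stream().anyMatch(Execution::isRunning)) {
                continue;
            }
            // Step 2.2: nothing is running; restart the last execution only if it failed.
            // Assumption: the highest id identifies the most recent execution.
            executions.stream()
                    .max(Comparator.comparingLong((Execution e) -> e.id))
                    .filter(e -> e.status == Status.FAILED)
                    .ifPresent(e -> toRestart.add(e.id));
        }
        return toRestart;
    }
}
```

In a real application the map would be filled from JobOperator#getJobInstances and JobOperator#getExecutions, and each returned id would be passed to JobOperator#restart. Both lookups return ids only, so the status of each execution has to be read separately, for example through JobExplorer (which is my suggestion, not part of the steps above).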

In your scenario:

  • jobInstance1 should be restarted in step 2.2
  • jobInstance2 should be filtered in step 2.1 since there is a running execution for it on node 2.