Spring Batch: problems (mix data) when converting to multithread

眼角桃花 2020-12-03 16:10

This may be a recurring question, but I need an answer tailored to my context.

I'm using Spring Batch 3.0.1.RELEASE.

I have a simple job with some steps.

1 answer
  •  一整个雨季
    2020-12-03 16:37

    You have asked a lot in your question (in the future, please break this type of question up into multiple, more specific questions). However, item by item:

    Is JdbcCursorItemReader thread-safe?
    As the documentation states, it is not. The reason for this is that the JdbcCursorItemReader wraps a single ResultSet which is not thread safe.
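
    One common workaround is to serialize access to the cursor so that only one thread reads at a time. The following is a minimal plain-Java sketch of that idea, not Spring Batch code: SimpleReader, CursorLikeReader, SynchronizedReader and SynchronizedReaderDemo are hypothetical names, and an in-memory list stands in for the ResultSet. Newer Spring Batch releases ship a SynchronizedItemStreamReader that plays a similar role; check whether your version includes it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for an item reader: read() returns null when the input is exhausted.
interface SimpleReader<T> {
    T read();
}

// Cursor-style reader: one shared position, like a single ResultSet. Not thread safe on its own.
class CursorLikeReader<T> implements SimpleReader<T> {
    private final Iterator<T> cursor;

    CursorLikeReader(List<T> rows) {
        this.cursor = rows.iterator();
    }

    @Override
    public T read() {
        return cursor.hasNext() ? cursor.next() : null;
    }
}

// Wrapper that lets only one thread at a time advance the underlying cursor.
class SynchronizedReader<T> implements SimpleReader<T> {
    private final SimpleReader<T> delegate;

    SynchronizedReader(SimpleReader<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T read() {
        return delegate.read();
    }
}

class SynchronizedReaderDemo {
    // Drains the reader from several threads, recording each item in 'seen'
    // (which must itself be a thread-safe set); returns the total number of items read.
    static int drain(SimpleReader<Integer> reader, int threads, Set<Integer> seen) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Callable<Integer> worker = () -> {
                    int count = 0;
                    Integer item;
                    while ((item = reader.read()) != null) {
                        seen.add(item);
                        count++;
                    }
                    return count;
                };
                results.add(pool.submit(worker));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

    The lock removes the data race on the cursor, but reads are still sequential; only processing and writing run in parallel.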

    Are the composite processor and writer right for multithread?
    The CompositeItemProcessor provided by Spring Batch is considered thread safe as long as the delegate ItemProcessor implementations are thread safe as well. You provide no code for your implementations or their configurations, so I can't verify their thread safety. However, given the symptoms you describe, my hunch is that there is a thread safety issue somewhere in your code.

    You also don't say which ItemWriter implementations you are using or how they are configured, so there may be thread-related issues there as well.

    If you update your question with more information about your implementations and configurations, we can provide more insight.

    How could I make a custom thread-safe composite processor?
    There are two things to consider when implementing any ItemProcessor:

    1. Make it thread safe: Following basic thread safety rules (read the book Java Concurrency In Practice for the bible on the topic) will allow you to scale your components by just adding a task executor.
    2. Make it idempotent: During skip/retry processing, items may be re-processed. An idempotent ItemProcessor implementation prevents side effects from those repeated trips through the processor.
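
    As a rough illustration of both points, here is a plain-Java sketch that does not use the Spring Batch API. SimpleProcessor, UnsafeUpperCaseProcessor, SafeUpperCaseProcessor and ChainProcessor are hypothetical names chosen for this example. A shared mutable field like the one in the unsafe variant is a typical cause of "mixed" data once a task executor is added.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an item processor; returning null means "filter this item out".
interface SimpleProcessor<I, O> {
    O process(I item);
}

// NOT thread safe: 'current' is shared by every thread calling process(),
// so one thread can overwrite the value another thread is about to return.
class UnsafeUpperCaseProcessor implements SimpleProcessor<String, String> {
    private String current; // shared mutable state

    @Override
    public String process(String item) {
        current = item;
        return current.toUpperCase(Locale.ROOT);
    }
}

// Thread safe and idempotent: no fields, and the output depends only on the input.
class SafeUpperCaseProcessor implements SimpleProcessor<String, String> {
    @Override
    public String process(String item) {
        return item.toUpperCase(Locale.ROOT);
    }
}

// Composite that is thread safe as long as every delegate is:
// the delegate list is copied once and never modified afterwards.
class ChainProcessor<T> implements SimpleProcessor<T, T> {
    private final List<SimpleProcessor<T, T>> delegates;

    ChainProcessor(List<SimpleProcessor<T, T>> delegates) {
        this.delegates = Collections.unmodifiableList(new ArrayList<>(delegates));
    }

    @Override
    public T process(T item) {
        T result = item; // local variable, so each thread has its own copy
        for (SimpleProcessor<T, T> delegate : delegates) {
            if (result == null) {
                return null; // an earlier delegate filtered the item
            }
            result = delegate.process(result);
        }
        return result;
    }
}
```

    Keeping all per-item state in local variables is what makes the composite safe to share; processing the same input twice yields the same output, which covers the retry case.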

    Maybe could it be the JDBC reader: Is there any thread-safe JDBC reader for multi-thread?
    As you have noted, the JdbcPagingItemReader is thread safe and noted as such in the documentation. When using multiple threads, each chunk is executed in its own thread. If you've configured the page size to match the commit-interval, that means each page is processed in the same thread.
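
    To show why paging lends itself to multiple threads, here is a minimal plain-Java sketch of the general idea: no cursor stays open, each page is fetched by offset and limit, and rows are handed out under a lock. It is loosely modeled on that idea and is not the actual JdbcPagingItemReader code; PagingLikeReader and the in-memory "table" are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical paging reader over an in-memory list that stands in for a database table.
class PagingLikeReader {
    private final List<Integer> table;
    private final int pageSize;
    private final Object lock = new Object();

    private List<Integer> page = new ArrayList<>(); // rows of the current page
    private int indexInPage = 0;                    // next row to hand out from the page
    private int nextOffset = 0;                     // offset of the next page to fetch

    PagingLikeReader(List<Integer> table, int pageSize) {
        this.table = table;
        this.pageSize = pageSize;
    }

    // Returns the next row, or null when no rows are left.
    Integer read() {
        synchronized (lock) {
            if (indexInPage >= page.size()) {
                // Current page is used up: fetch the next one with a fresh "query".
                page = fetchPage(nextOffset, pageSize);
                nextOffset += pageSize;
                indexInPage = 0;
                if (page.isEmpty()) {
                    return null;
                }
            }
            return page.get(indexInPage++);
        }
    }

    // Stands in for a query with OFFSET/LIMIT semantics.
    private List<Integer> fetchPage(int offset, int limit) {
        if (offset >= table.size()) {
            return new ArrayList<>();
        }
        int end = Math.min(offset + limit, table.size());
        return new ArrayList<>(table.subList(offset, end));
    }
}
```

    With a page size of 10 and a commit-interval of 10, one fetched page lines up with one chunk, which is the alignment described above.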

    Other options for scaling a single step
    While you went down the path of implementing a single, multi-threaded step, there may be better options. Spring Batch provides 5 core scaling options:

    1. Multithreaded step - As you are trying right now.
    2. Parallel Steps - Using Spring Batch's split functionality you can execute multiple steps in parallel. Given that you're working with composite ItemProcessor and composite ItemWriters in the same step, this may be something to explore (breaking your current composite scenarios into multiple, parallel steps).
    3. Async ItemProcessor/ItemWriters - This option allows you to execute the processor logic in a different thread. The processor hands the work off to another thread and returns a Future to the AsyncItemWriter, which blocks until the Future completes and then writes the result.
    4. Partitioning - This is the division of the data into blocks called partitions that are processed in parallel by child steps. Each partition is processed by an actual, independent step, so using step scoped components can prevent thread safety issues (each step gets its own instance). Partition processing can be performed either locally via threads or remotely across multiple JVMs.
    5. Remote Chunking - This option farms the processor logic out to other JVM processes. It really should only be used if the ItemProcessor logic is the bottleneck in the flow.
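
    The Future hand-off in option 3 can be sketched in plain Java as follows. This is an illustration of the pattern under stated assumptions, not the Spring Batch classes themselves: AsyncProcessorSketch and AsyncWriterSketch are hypothetical names, and a list stands in for the real output. In Spring Batch the corresponding AsyncItemProcessor and AsyncItemWriter live in the spring-batch-integration module.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

// Submits the real processing logic to an executor and returns immediately with a Future.
class AsyncProcessorSketch<I, O> {
    private final Function<I, O> delegate;
    private final ExecutorService executor;

    AsyncProcessorSketch(Function<I, O> delegate, ExecutorService executor) {
        this.delegate = delegate;
        this.executor = executor;
    }

    Future<O> process(I item) {
        Callable<O> task = () -> delegate.apply(item);
        return executor.submit(task);
    }
}

// Receives a chunk of Futures, waits for each result in order, then "writes" it.
class AsyncWriterSketch<O> {
    private final List<O> sink = new ArrayList<>(); // stands in for the real output

    void write(List<Future<O>> chunk) {
        for (Future<O> future : chunk) {
            try {
                O result = future.get(); // blocks until processing has finished
                if (result != null) {
                    sink.add(result);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
    }

    List<O> written() {
        return sink;
    }
}
```

    Because the writer walks the Futures in chunk order, output order matches input order even though the items are processed concurrently. Reading and writing stay on the step's own thread; only the processor logic is parallelized.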

    You can read about all of these options in the documentation for Spring Batch here: http://docs.spring.io/spring-batch/trunk/reference/html/scalability.html

    Thread safety is a complex problem. Just adding multiple threads to code that used to work in a single-threaded environment will typically uncover issues in your code.
