Multithreaded database reading

星月不相逢 2021-02-04 22:16

In our Java application I have a requirement to read around 80 million records from an Oracle database. I am trying to redesign the multithreaded program for this. Currently we

3 answers
  •  忘掉有多难
    2021-02-04 22:29

    1. Create a dispatcher thread that reads only the primary keys (the IDs) of the rows, N at a time. For example, read N=1000 IDs and give them to Worker1, read the next N=1000 IDs and give them to Worker2, and so on. This way the dispatcher never needs to hold more than N=1000 IDs in memory: once a batch of IDs has been handed to a worker thread, the dispatcher can discard it.

    2. Each worker thread takes its N (e.g. 1000) IDs and uses them to fetch the corresponding rows from the database. Make sure you use row-level locking here (the ROWLOCK hint in T-SQL, or its equivalent if you're not using SQL Server) so that the threads do not get in each other's way. The worker reads its N rows, processes them, and once complete may signal the dispatcher (something like an "I am done" event).

    This is just the first idea that comes to mind. I guess it could be refined further with some more thought.
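
The dispatcher/worker scheme described above could be sketched in Java roughly as follows. This is only a minimal illustration under stated assumptions, not a tested production design: `BatchDispatcher`, `dispatch`, and the `processBatch` callback are hypothetical names, the ID source is a plain `Iterator<Long>` standing in for a streaming query over the primary keys, and the `Semaphore` is one possible way to model the "I am done" signal so that the dispatcher never buffers more batches than there are workers. The actual database access (fetching rows by ID, locking) is left to the callback.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;
import java.util.stream.LongStream;

// Hypothetical helper: all names here are illustrative, not from the original answer.
final class BatchDispatcher {

    /**
     * Reads IDs from idSource in chunks of batchSize and hands each chunk
     * to a worker thread. Returns the number of batches dispatched.
     */
    static int dispatch(Iterator<Long> idSource, int batchSize, int workers,
                        Consumer<List<Long>> processBatch) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        // One permit per worker: the dispatcher blocks when all workers are busy,
        // so at most workers + 1 batches of IDs exist in memory at any time.
        Semaphore freeWorkers = new Semaphore(workers);
        int batches = 0;
        try {
            while (idSource.hasNext()) {
                // Step 1: the dispatcher collects the next N IDs.
                List<Long> batch = new ArrayList<>(batchSize);
                while (idSource.hasNext() && batch.size() < batchSize) {
                    batch.add(idSource.next());
                }
                freeWorkers.acquire(); // wait until some worker is free
                // Step 2: a worker fetches and processes the rows for these IDs.
                pool.execute(() -> {
                    try {
                        processBatch.accept(batch);
                    } finally {
                        freeWorkers.release(); // the "I am done" signal
                    }
                });
                batches++; // the dispatcher drops its reference on the next iteration
            }
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("dispatcher interrupted", e);
        } finally {
            pool.shutdownNow(); // no-op if the pool has already terminated
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for a query that streams primary keys from the database.
        Iterator<Long> ids = LongStream.rangeClosed(1, 5000).iterator();
        AtomicLong rows = new AtomicLong();
        // Stand-in worker: a real one would fetch the rows for these IDs and process them.
        int batches = dispatch(ids, 1000, 4, batch -> rows.addAndGet(batch.size()));
        System.out.println("batches=" + batches + " rows=" + rows.get());
    }
}
```

In a real worker the callback would typically run one query per batch that selects the rows whose ID is in the batch. Because the dispatcher blocks on the semaphore rather than queueing work, memory use stays bounded no matter how many IDs the source yields.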
