JPA: what is the proper pattern for iterating over large result sets?

攒了一身酷 2020-11-27 09:50

Let's say I have a table with millions of rows. Using JPA, what's the proper way to iterate over a query against that table, such that I don't have all an in-memo

15 answers
  •  自闭症患者
    2020-11-27 10:35

    I was surprised that stored procedures are not more prominent in the answers here. When I have had to do something like this in the past, I created a stored procedure that processes the data in small chunks, sleeps for a bit, then continues. The sleep keeps the job from overwhelming a database that is presumably also serving more real-time queries, such as those behind a web site. If nothing else is using the database, you can leave the sleep out. If you need to process each record once and only once, you will also need an additional table (or field) that records which rows have been processed, so that the job is resilient across restarts.
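
    The chunk, pause, and checkpoint loop described above can be sketched roughly as follows. This is only an illustration of the control flow, written in Java with in-memory stand-ins rather than in any particular stored-procedure dialect. The names `ChunkedBatch`, `Checkpoint`, `run`, and the `maxChunks` parameter (used here to simulate an interrupted run) are all hypothetical. In a real database the row update and the checkpoint update would normally share one transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** In-memory sketch of a chunked batch job with a pause and a restart checkpoint. */
public class ChunkedBatch {

    /** Hypothetical checkpoint; in a database this would be an extra table or column. */
    static final class Checkpoint {
        long lastProcessedId = 0;
    }

    /**
     * Processes rows whose id is greater than the checkpoint, chunkSize rows at a time,
     * sleeping pauseMillis between chunks. Stops after maxChunks chunks or when no rows remain.
     */
    static List<String> run(TreeMap<Long, String> table, Checkpoint checkpoint,
                            int chunkSize, long pauseMillis, int maxChunks) {
        List<String> processed = new ArrayList<>();
        for (int chunk = 0; chunk < maxChunks; chunk++) {
            // Fetch the next chunk: ids strictly after the last processed id, in order.
            List<Long> ids = new ArrayList<>();
            for (Long id : table.tailMap(checkpoint.lastProcessedId, false).keySet()) {
                ids.add(id);
                if (ids.size() == chunkSize) {
                    break;
                }
            }
            if (ids.isEmpty()) {
                break; // nothing left to do
            }
            for (Long id : ids) {
                processed.add(table.get(id).toUpperCase()); // placeholder for the real work
            }
            // Record progress so that a restart resumes after this chunk.
            checkpoint.lastProcessedId = ids.get(ids.size() - 1);
            if (pauseMillis > 0) {
                try {
                    Thread.sleep(pauseMillis); // give other database clients room to run
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> table = new TreeMap<>();
        for (long id = 1; id <= 7; id++) {
            table.put(id, "row" + id);
        }
        Checkpoint checkpoint = new Checkpoint();
        List<String> first = run(table, checkpoint, 3, 10, 2);   // simulated interrupted run
        List<String> second = run(table, checkpoint, 3, 10, Integer.MAX_VALUE); // restart
        System.out.println(first.size() + " then " + second.size()
                + ", checkpoint=" + checkpoint.lastProcessedId);
    }
}
```

    Resuming from the last processed id, rather than from a row offset, means a restarted run does not revisit rows that were already handled.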

    The performance savings here are significant: this approach is possibly orders of magnitude faster than anything you could do in JPA/Hibernate/AppServer land, and your database server will most likely have its own server-side cursor mechanism for processing large result sets efficiently. The savings come from not having to ship the data from the database server to the application server, process it there, and then ship it back.

    There are some significant downsides to stored procedures that may rule this approach out for you completely, but if you have that skill in your personal toolbox and can use it in this kind of situation, you can knock out these kinds of jobs fairly quickly.
