JPA: what is the proper pattern for iterating over large result sets?

攒了一身酷 2020-11-27 09:50

Let's say I have a table with millions of rows. Using JPA, what's the proper way to iterate over a query against that table, such that I don't have all an in-memo

15 answers
  •  误落风尘
    2020-11-27 10:16

    I have wondered about this myself. The answer seems to depend on:

    • how big your dataset is (rows)
    • what JPA implementation you are using
    • what kind of processing you are doing for each row.

    I wrote an Iterator that makes it easy to switch between the two approaches: loading everything in one query (findAll) or fetching it chunk by chunk (findEntries).

    I recommend you try both and compare.

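    // Needs: import java.util.Iterator;
    //        import com.google.common.collect.AbstractIterator;

    // Approach 1: fetch the rows chunk by chunk (here 2 rows per query).
    Long count = entityManager().createQuery("select count(o) from Model o", Long.class).getSingleResult();
    ChunkIterator<Model> it1 = new ChunkIterator<Model>(count, 2) {

        @Override
        public Iterator<Model> getChunk(long index, long chunkSize) {
            // Skip `index` rows and return at most `chunkSize` rows.
            // The "order by m.id" assumes Model has an id attribute; some
            // stable ordering is needed so that chunks do not overlap.
            return entityManager()
                    .createQuery("select m from Model m order by m.id", Model.class)
                    .setFirstResult((int) index)
                    .setMaxResults((int) chunkSize)
                    .getResultList()
                    .iterator();
        }

    };

    // Approach 2: load every row with a single query.
    Iterator<Model> it2 = entityManager().createQuery("select m from Model m", Model.class).getResultList().iterator();


    public static abstract class ChunkIterator<T>
        extends AbstractIterator<T> implements Iterable<T> {
        private Iterator<T> chunk;    // rows of the chunk currently being consumed
        private Long count;           // total number of rows, from the count query
        private long index = 0;       // offset of the next chunk to fetch
        private long chunkSize = 100;

        public ChunkIterator(Long count, long chunkSize) {
            super();
            this.count = count;
            this.chunkSize = chunkSize;
        }

        // Returns an iterator over the rows [index, index + chunkSize).
        public abstract Iterator<T> getChunk(long index, long chunkSize);

        @Override
        public Iterator<T> iterator() {
            return this;
        }

        @Override
        protected T computeNext() {
            if (count == 0) return endOfData();
            // Current chunk is used up and every offset has been requested.
            if (chunk != null && !chunk.hasNext() && index >= count)
                return endOfData();
            // Fetch the next chunk lazily, only when the current one is empty.
            if (chunk == null || !chunk.hasNext()) {
                chunk = getChunk(index, chunkSize);
                index += chunkSize;
            }
            // An empty chunk means the table has fewer rows than counted.
            if (chunk == null || !chunk.hasNext())
                return endOfData();
            return chunk.next();
        }

    }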
    

    I ended up not using my chunk iterator, so it is not well tested. Note that it depends on Google Collections (now part of Guava) for AbstractIterator.
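
    If pulling in Guava is not an option, the same chunking idea can be sketched with java.util alone. This is a minimal sketch and not the code above: PagedIterator and the fetchPage callback are hypothetical names, and instead of running a count query first it stops as soon as a page comes back shorter than the page size.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/**
 * Iterates over a large result set one page at a time, using only java.util.
 * PagedIterator and fetchPage are hypothetical names chosen for this sketch.
 */
class PagedIterator<T> implements Iterator<T> {
    // Given (offset, limit), returns that page of rows.
    private final BiFunction<Integer, Integer, List<T>> fetchPage;
    private final int pageSize;
    private Iterator<T> page = Collections.emptyIterator();
    private int offset = 0;
    private boolean exhausted = false;

    PagedIterator(int pageSize, BiFunction<Integer, Integer, List<T>> fetchPage) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be positive");
        this.pageSize = pageSize;
        this.fetchPage = fetchPage;
    }

    @Override
    public boolean hasNext() {
        // Load the next page lazily, only when the current one is used up.
        while (!page.hasNext() && !exhausted) {
            List<T> rows = fetchPage.apply(offset, pageSize);
            offset += pageSize;
            // A short page means the query ran out of rows, so stop fetching.
            if (rows.size() < pageSize) exhausted = true;
            page = rows.iterator();
        }
        return page.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return page.next();
    }
}
```

    With JPA, the callback would presumably be something like (offset, limit) -> query.setFirstResult(offset).setMaxResults(limit).getResultList(), on a query with a stable order by. Since loaded entities stay managed by the persistence context, it may also help to call entityManager.clear() between pages so that memory use stays flat.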
