How does Hibernate batch insert work?

*爱你&永不变心* posted on 2019-12-20 02:39:10

Question


Can someone explain how

hibernate.jdbc.batch_size=1000 

and

if (i % 100 == 0 && i > 0) {
    session.flush();
    session.clear();
}

work together? ...


Answer 1:


The Hibernate property hibernate.jdbc.batch_size is a way for Hibernate to optimize your insert or update statements, whereas the flushing loop is about avoiding memory exhaustion.

Without a batch size, when you save an entity Hibernate fires one insert statement, so if you work with a big collection, Hibernate fires one statement for each save.

Imagine the following chunk of code :

for (Entity e : entities) {
    session.save(e);
}

Here Hibernate will fire one insert statement per entity in your collection: if the collection has 100 elements, 100 insert statements will be fired. This approach is not very efficient, for two main reasons:

  • 1) Your first-level cache keeps growing with every saved entity, and you will probably soon end up with an OutOfMemoryError.
  • 2) You degrade performance because of the network round trip for each statement.

hibernate.jdbc.batch_size and the flushing loop have two different purposes, but they are complementary.

Hibernate uses the first to control how many statements go into each batch. Under the covers, Hibernate uses the addBatch() and executeBatch() methods of java.sql.PreparedStatement.

So hibernate.jdbc.batch_size tells Hibernate how many times to call addBatch() before calling executeBatch().

So setting this property does not protect you from memory exhaustion.
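As a rough illustration of the addBatch()/executeBatch() pattern described above, here is a minimal plain-JDBC sketch. It is not Hibernate's actual implementation; the class and method names, the person table and its name column are all made up for the example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class JdbcBatchSketch {

    /**
     * Inserts the given names in JDBC batches of at most batchSize rows and
     * returns how many times executeBatch() was called.
     * The table "person" and the column "name" are hypothetical.
     */
    static int insertInBatches(Connection connection, List<String> names, int batchSize)
            throws SQLException {
        int executions = 0;
        try (PreparedStatement ps =
                 connection.prepareStatement("INSERT INTO person (name) VALUES (?)")) {
            int pending = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();          // queue the row in the current batch
                pending++;
                if (pending == batchSize) {
                    ps.executeBatch();  // submit the full batch to the database
                    executions++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();      // submit the final, partial batch
                executions++;
            }
        }
        return executions;
    }
}
```

With 5 names and a batchSize of 2, this calls addBatch() 5 times and executeBatch() 3 times (2 + 2 + 1 rows). How a batch maps to actual network round trips depends on the JDBC driver.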

To take care of memory, you have to flush and clear your session on a regular basis, and this is the purpose of the flushing loop.

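When you write:

int i = 0;
for (Entity e : entities) {
    session.save(e);
    i++;
    // every 100 saved entities, push the pending inserts and empty the session
    if (i % 100 == 0) {
        session.flush();
        session.clear();
    }
}

you're telling Hibernate to flush and clear the session every 100 entities (which releases memory).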

So now, what is the link between the two?

To be optimal, your jdbc.batch_size and your flush interval should be identical.

If you define a flush interval lower than the batch_size you chose, Hibernate will flush the session more frequently, so it will execute small batches that never reach the batch size, which is not efficient.

When the two are the same, Hibernate will only execute batches of optimal size, except for the last one if the size of the collection is not a multiple of your batch_size.

You can see the following post for more details about this last point




Answer 2:


hibernate.jdbc.batch_size determines the maximum size of a batch that is executed. If an implicit or explicit flush is performed before the specified batch size is reached (counted as the number of pending insert or update statements for the same table), all pending statements are packed into one batch, and the 'accumulation' of statements is restarted.

So, in your example you would execute batches consisting of 100 statements each. Or, for example, if the batch size were 100 and the modulo divider were 500, when the flush operation occurs you would execute 5 batches consisting of 100 statements each.
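The arithmetic above can be checked with a small simulation. This is a simplified model of the behaviour described in this answer, not Hibernate code: it assumes all statements target the same table, and that a flush also happens after the last entity (for example at commit). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSimulation {

    /**
     * Returns the sizes of the batches that would be executed when saving
     * entityCount entities with the given batch size and flush interval.
     */
    static List<Integer> executedBatches(int entityCount, int batchSize, int flushInterval) {
        List<Integer> batches = new ArrayList<>();
        int pending = 0; // statements accumulated since the last flush
        for (int i = 1; i <= entityCount; i++) {
            pending++;
            boolean lastEntity = (i == entityCount);
            if (i % flushInterval == 0 || lastEntity) {
                // A flush sends everything that is pending, in chunks of at most batchSize.
                while (pending > 0) {
                    int size = Math.min(batchSize, pending);
                    batches.add(size);
                    pending -= size;
                }
            }
        }
        return batches;
    }
}
```

Here executedBatches(500, 100, 500) gives five batches of 100, while executedBatches(1000, 1000, 100) gives ten batches of 100, because every flush cuts the batch short before it can reach 1000.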




Answer 3:


Batch Processing allows you to group related SQL statements into a batch and submit them with one call to the database.

Why we need it

It is important to keep in mind that each update added to a Statement or PreparedStatement is executed separately by the database. That means that some of them may succeed before one of them fails. All the statements that have succeeded are now applied to the database, but the rest of the updates may not be. This can result in inconsistent data in the database.

To avoid this, you can execute the batch update inside a transaction. When it is executed inside a transaction, you can make sure that either all updates are executed or none are. Any successful updates can be rolled back in case one of the updates fails.
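A minimal plain-JDBC sketch of that idea is shown here: the batch runs with auto-commit turned off, so a failure rolls back everything that was applied so far. The class and method names and the person table are assumptions for the example, not something from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class TransactionalBatchSketch {

    /**
     * Inserts all names in one batch inside a transaction.
     * Returns true if the batch was committed, false if it was rolled back.
     * The table "person" and the column "name" are hypothetical.
     */
    static boolean insertAllOrNothing(Connection connection, List<String> names)
            throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);   // start an explicit transaction
        try (PreparedStatement ps =
                 connection.prepareStatement("INSERT INTO person (name) VALUES (?)")) {
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
            }
            ps.executeBatch();             // throws a SQLException if a statement fails
            connection.commit();           // every row is applied
            return true;
        } catch (SQLException e) {
            connection.rollback();         // no row is applied
            return false;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```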

What batching and flushing are

Batch size and flushing are different things. When you set hibernate.jdbc.batch_size to 1000, it means Hibernate will batch inserts or updates in groups of up to 1000 statements. The flush operation can be used to write all changes to the database before the transaction is committed.

If your batch size is set to 1000 and you flush every 100 entities, Hibernate will execute lots of small batches of 100 insert or update statements: 10 of them for every 1000 entities.

Please read more at the link below:

http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/batch.html

Why number of objects being flushed should be equal to hibernate.jdbc.batch_size?



Source: https://stackoverflow.com/questions/45670583/how-hibernate-batch-insert-works
