I need to do a massive insert using EJB 3, Hibernate, Spring Data and Oracle. Currently I am using Spring Data, with the code below:
talaoAITDAO.save(taloes);
I recently found a promising small library for batching inserts with Hibernate and PostgreSQL. It is called pedal-dialect and uses the PostgreSQL COPY command, which many people claim is much faster than batched inserts (references: Postgresql manual, Postgresql Insert Strategies - Performance Test, How does copy work and why is it so much faster than insert?). pedal-dialect lets you use COPY without fully losing the ease of use of Hibernate. You still get automatic mapping of entities and rows and don't have to implement it on your own.
A couple of things.
First, your configuration properties are wrong: order_inserts must be hibernate.order_inserts. Currently your setting is ignored and you haven't changed a thing.
Next, use the EntityManager instead of doing all that nasty Hibernate stuff. The EntityManager also has flush and clear methods. This should at least clean up your method. Even without the ordering, this helps a little by cleaning up the session and preventing dirty checks on all the objects in it.
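EntityManager em = getEntityManager();
int batchSize = 1000; // number of entities to persist between flushes
for (int i = 0; i < taloes.size(); i++) {
    TalaoAIT talaoAIT = taloes.get(i);
    em.persist(talaoAIT);
    // Flush and clear after every full batch so the persistence context stays small.
    if ((i + 1) % batchSize == 0) {
        em.flush();
        em.clear();
    }
}
// Write out whatever is left from the last, partial batch.
em.flush();
em.clear();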
Next, you shouldn't make your batches too large, as that can cause memory problems. Start with something like 50 and test what performs best. There is a point at which dirty-checking is going to take more time than flushing and clearing to the database. You want to find this sweet spot.
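To make the flush points easier to reason about while experimenting with batch sizes, here is a minimal sketch of the same loop pulled into a helper. BatchWriter and writeInBatches are hypothetical names, not part of JPA or Hibernate, and the persist and flush steps are passed in as callbacks so the batching logic can be checked without a database.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of JPA or Hibernate. */
final class BatchWriter {
    private BatchWriter() {}

    /**
     * Calls persist for every item and flushAndClear after each full batch,
     * plus once more if a partial batch is left at the end.
     * Returns how many times flushAndClear was called.
     */
    static <T> int writeInBatches(List<T> items, int batchSize,
                                  Consumer<? super T> persist,
                                  Runnable flushAndClear) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        for (int i = 0; i < items.size(); i++) {
            persist.accept(items.get(i));
            // (i + 1) counts the items persisted so far, so this fires
            // after item 50, 100, ... and not after the very first item.
            if ((i + 1) % batchSize == 0) {
                flushAndClear.run();
                flushes++;
            }
        }
        // Flush the trailing partial batch, if there is one.
        if (items.size() % batchSize != 0) {
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }
}
```

With an EntityManager this could be called as BatchWriter.writeInBatches(taloes, 50, em::persist, () -> { em.flush(); em.clear(); }); and rerun with different batch sizes to look for the sweet spot.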
The solution posted by M. Deinum worked great for me, provided I set the following Hibernate properties in my JPA persistence.xml file:
<property name="hibernate.jdbc.batch_size" value="50" />
<property name="hibernate.jdbc.batch_versioned_data" value="true" />
<property name="hibernate.order_inserts" value="true" />
<property name="hibernate.order_updates" value="true" />
<property name="hibernate.cache.use_second_level_cache" value="false" />
<property name="hibernate.connection.autocommit" value="false" />
I am using an Oracle database, so I also have this one defined:
<property name="hibernate.dialect" value="org.hibernate.dialect.Oracle10gDialect" />
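If the EntityManagerFactory is created by the application rather than by the container, the same batching properties can be supplied from code, which keeps the JDBC batch size and the loop's flush interval tied to one value. The sketch below only builds the property map; BatchProperties and forBatchSize are hypothetical names, and passing the map to Persistence.createEntityManagerFactory(unitName, props) is an assumption about how the factory is created. In a container-managed EJB setup, persistence.xml remains the place for these settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of JPA or Hibernate. */
final class BatchProperties {
    private BatchProperties() {}

    /** Builds the Hibernate batching properties listed above for a given batch size. */
    static Map<String, String> forBatchSize(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<String, String> props = new LinkedHashMap<>();
        // Note the "hibernate." prefix: without it the setting is ignored.
        props.put("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        props.put("hibernate.jdbc.batch_versioned_data", "true");
        props.put("hibernate.order_inserts", "true");
        props.put("hibernate.order_updates", "true");
        return props;
    }
}
```

The same batchSize value can then be reused as the flush interval in the persist loop, so the two settings do not drift apart.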