Why does Hibernate disable INSERT batching when using an IDENTITY identifier generator
Transactional write-behind
Hibernate tries to defer the Persistence Context flushing up until the last possible moment. This strategy has been traditionally known as transactional write-behind.
Write-behind relates to Hibernate flushing rather than to any logical or physical transaction. During a transaction, the flush may occur multiple times.
The flushed changes are visible only to the current database transaction. Until the current transaction is committed, no change is visible to other concurrent transactions.
IDENTITY
The IDENTITY generator allows an int or bigint column to be auto-incremented on demand. The increment process happens outside of the current running transaction, so a roll-back may end up discarding already assigned values (value gaps may happen).
The increment process is very efficient since it uses a database-internal lightweight locking mechanism as opposed to the more heavyweight transactional coarse-grained locks.
The only drawback is that we can’t know the newly assigned value prior to executing the INSERT statement. This restriction hinders the transactional write-behind flushing strategy adopted by Hibernate. For this reason, Hibernate disables the JDBC batch support for entities using the IDENTITY generator.
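The effect on write-behind can be pictured with a toy model. This is only a sketch under stated assumptions, not Hibernate's actual implementation: `ToySession`, `IdStrategy` and the log format are invented for illustration, and the round trips needed to fetch sequence values are ignored.

```java
import java.util.ArrayList;
import java.util.List;

class WriteBehindSketch {
    enum IdStrategy { IDENTITY, SEQUENCE }

    /** Toy stand-in for a persistence context; not a Hibernate class. */
    static final class ToySession {
        private final IdStrategy strategy;
        private final List<String> pendingInserts = new ArrayList<>();
        final List<String> log = new ArrayList<>();
        private long nextId = 1;

        ToySession(IdStrategy strategy) {
            this.strategy = strategy;
        }

        long persist(String name) {
            long id = nextId++;
            if (strategy == IdStrategy.IDENTITY) {
                // The database assigns the id while executing the INSERT,
                // so the statement has to run right away, one at a time.
                log.add("persist: INSERT " + name + " -> id " + id);
            } else {
                // The id is known before the INSERT, so the statement can wait for flush.
                pendingInserts.add(name);
            }
            return id;
        }

        void flush() {
            if (!pendingInserts.isEmpty()) {
                log.add("flush: batch of " + pendingInserts.size() + " INSERTs");
                pendingInserts.clear();
            }
        }
    }

    public static void main(String[] args) {
        for (IdStrategy strategy : IdStrategy.values()) {
            ToySession session = new ToySession(strategy);
            session.persist("a");
            session.persist("b");
            session.persist("c");
            session.flush();
            System.out.println(strategy + " -> " + session.log);
        }
    }
}
```

In this model the IDENTITY session logs three separate statements at persist time, while the SEQUENCE session logs a single batch at flush time.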
TABLE
The only solution would be to use a TABLE identifier generator, backed by a pooled-lo optimizer. This generator works with MySQL too, so it overcomes the lack of database SEQUENCE support.
However, the TABLE generator performs worse than IDENTITY, so in the end, this is not a viable alternative.
Conclusion
Therefore, using IDENTITY is still the best choice on MySQL, and if you need batching for inserts, you can use plain JDBC for that.
Spring Batch JPA Bulk Insert eats performance when using GenerationType.IDENTITY
Hibernate cannot batch insert entities if the entity uses IDENTITY to generate its ID (this is also mentioned in the docs).
So you have to switch to SEQUENCE to generate the ID. In addition, choose the "pooled" or "pooled-lo" optimizer to fetch new IDs from the sequence, which improves performance further by reducing the round trips needed to get IDs.
So the ID mapping looks like this:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator="emp_sequence")
@SequenceGenerator(name="emp_sequence", sequenceName = "emp_id_seq", allocationSize = 100)
private Long id;
And the Hibernate settings:
spring.jpa.properties.hibernate.order_inserts = true
spring.jpa.properties.hibernate.order_updates = true
spring.jpa.properties.hibernate.jdbc.batch_size = 1000
spring.jpa.properties.hibernate.jdbc.batch_versioned_data = true
# Use the "pooled-lo" optimizer for generating IDs when using JPA @SequenceGenerator
spring.jpa.properties.hibernate.id.optimizer.pooled.preferred = pooled-lo
Also, you have to make sure the corresponding ID sequence in PostgreSQL is aligned with the configuration in @SequenceGenerator:
alter sequence emp_id_seq increment by 100;
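To see why the sequence increment and allocationSize have to match, here is a minimal simulation of the pooled-lo idea. It is a sketch, not Hibernate's optimizer code: `FakeSequence` and `PooledLoAllocator` are hypothetical names, and the sequence is assumed to start at 1.

```java
import java.util.function.LongSupplier;

class PooledLoSketch {
    /** Stand-in for a database sequence created with "increment by <step>". */
    static final class FakeSequence implements LongSupplier {
        private final int step;
        private long next = 1; // assumed start value
        int calls = 0;         // counts simulated database round trips

        FakeSequence(int step) {
            this.step = step;
        }

        @Override
        public long getAsLong() {
            calls++;
            long value = next;
            next += step;
            return value;
        }
    }

    /** Treats each sequence value as the low end of a block of allocationSize ids. */
    static final class PooledLoAllocator {
        private final LongSupplier sequence;
        private final int allocationSize;
        private long lo;
        private int used;

        PooledLoAllocator(LongSupplier sequence, int allocationSize) {
            this.sequence = sequence;
            this.allocationSize = allocationSize;
            this.used = allocationSize; // forces a sequence call on first use
        }

        long nextId() {
            if (used == allocationSize) {
                lo = sequence.getAsLong(); // one round trip reserves a whole block
                used = 0;
            }
            return lo + used++;
        }
    }

    public static void main(String[] args) {
        FakeSequence sequence = new FakeSequence(100);
        PooledLoAllocator allocator = new PooledLoAllocator(sequence, 100);
        long last = 0;
        for (int i = 0; i < 250; i++) {
            last = allocator.nextId();
        }
        // prints "last id 250 after 3 sequence calls"
        System.out.println("last id " + last + " after " + sequence.calls + " sequence calls");
    }
}
```

If the fake sequence were created with a step of 1 while the allocator still assumed blocks of 100, the second block would start at 2 and overlap the first one, which is the mismatch the alter sequence statement above avoids.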
Another tip is to add reWriteBatchedInserts=true to the JDBC connection string, which provides a 2-3x performance improvement according to the docs.
JPA batch inserts on MySQL with identity generation strategy
The implication of this approach is that your entities are still detached from the persistence context and the created ID is not set on the entities.
If that is fine for you, this is a fine approach.
Is Hibernate batching generating the correct statements?
Yes, it's the right behavior.
Batching doesn't consist of generating one giant query. It consists of adding the same insert statement to a batch several times, with different parameters, and then executing the batch. This still uses the same prepared statement and still allows a single round trip to the database.
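In plain JDBC, that pattern looks roughly like the sketch below. The `employee` table, its `name` column and the `insertAll` helper are assumptions made for illustration; the caller is expected to supply an open `Connection` and to handle transactions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchSketch {
    // Hypothetical table and column, used only for illustration.
    private static final String SQL = "INSERT INTO employee (name) VALUES (?)";

    /** Inserts all names in chunks of batchSize and returns how many batches were executed. */
    static int insertAll(Connection connection, List<String> names, int batchSize) throws SQLException {
        int executedBatches = 0;
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            int pending = 0;
            for (String name : names) {
                statement.setString(1, name); // same statement, different parameters
                statement.addBatch();         // queued on the client, nothing sent yet
                if (++pending == batchSize) {
                    statement.executeBatch(); // the queued rows are sent together
                    executedBatches++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                statement.executeBatch();     // send the final partial batch
                executedBatches++;
            }
        }
        return executedBatches;
    }
}
```

Calling `insertAll(connection, names, 1000)` with 2,500 names would execute three batches instead of 2,500 individual statements. How much the driver actually combines on the wire depends on the driver and its settings.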
NHibernate disables insert batching at the ADO level transparently if I use an identity identifier generator. Why?
Identity needs a round trip to the database, so you can know which id was generated. More details here:
- http://ayende.com/blog/3915/nhibernate-avoid-identity-generator-when-possible
- http://nhforge.org/blogs/nhibernate/archive/2009/03/20/nhibernate-poid-generators-revealed.aspx
- http://fabiomaulo.blogspot.com/2009/02/nh210-generators-behavior-explained.html