The Dark Side of Hibernate Auto Flush
Introduction
Now that I have described the basics of JPA and Hibernate flush strategies, I can continue unraveling the surprising behavior of Hibernate’s AUTO flush mode.
Not all queries trigger a Session flush
Many would assume that Hibernate always flushes the Session before executing any query. While this might have been a more intuitive approach, and probably closer to JPA’s AUTO FlushModeType, Hibernate tries to optimize that. If the currently executing query is not going to hit the tables affected by the pending SQL INSERT/UPDATE/DELETE statements, then the flush is not strictly required.
As stated in the reference documentation, the AUTO flush strategy may sometimes synchronize the current persistence context prior to a query execution. It would have been more intuitive if the framework authors had chosen to name it FlushMode.SOMETIMES.
JPQL/HQL and SQL
Like many other ORM solutions, Hibernate offers a limited Entity querying language (JPQL/HQL) that’s very much based on SQL-92 syntax.
The entity query language is translated to SQL by the current database dialect, so it must offer the same functionality across different database products. Since most database systems are SQL-92 compliant, the Entity Query Language is an abstraction of the most common database querying syntax.
While you can use the Entity Query Language in many use cases (selecting Entities and even projections), there are times when its limited capabilities are no match for an advanced querying request. Whenever we want to make use of database-specific querying techniques, we have no other option but to run native SQL queries.
Hibernate is a persistence framework. Hibernate was never meant to replace SQL. If some query is better expressed in a native query, then it’s not worth sacrificing application performance on the altar of database portability.
AUTO flush and HQL/JPQL
First, we are going to test how the AUTO flush mode behaves when an HQL query is about to be executed. For this, we use two unrelated entities, Product and User.
The test will execute the following actions:
- A Product is going to be persisted.
- Selecting User(s) should not trigger the flush.
- When querying for Product, the AUTO flush should trigger the entity state transition synchronization (a product INSERT should be executed prior to executing the select query).
Product product = new Product();
session.persist(product);
assertEquals(0L, session.createQuery("select count(id) from User").uniqueResult());
assertEquals(product.getId(), session.createQuery("select p.id from Product p").uniqueResult());
Giving the following SQL output:
[main]: o.h.e.i.AbstractSaveEventListener - Generated identifier: f76f61e2-f3e3-4ea4-8f44-82e9804ceed0, using strategy: org.hibernate.id.UUIDGenerator
Query:{[select count(user0_.id) as col_0_0_ from user user0_][]}
Query:{[insert into product (color, id) values (?, ?)][12,f76f61e2-f3e3-4ea4-8f44-82e9804ceed0]}
Query:{[select product0_.id as col_0_0_ from product product0_][]}
As you can see, the User select hasn’t triggered the Session flush. This is because Hibernate inspects the current query space against the pending table statements. If the currently executing query doesn’t overlap with the unflushed table statements, the flush can be safely skipped.
HQL can detect the Product flush even for:
- Sub-selects
session.persist(product);
assertEquals(0L, session.createQuery(
    "select count(*) " +
    "from User u " +
    "where u.favoriteColor in (select distinct(p.color) from Product p)").uniqueResult());

Query:{[insert into product (color, id) values (?, ?)][Blue,2d9d1b4f-eaee-45f1-a480-120eb66da9e8]}
Query:{[select count(*) as col_0_0_ from user user0_ where user0_.favoriteColor in (select distinct product1_.color from product product1_)][]}
- Or theta-style joins
session.persist(product);
assertEquals(0L, session.createQuery(
    "select count(*) " +
    "from User u, Product p " +
    "where u.favoriteColor = p.color").uniqueResult());

Query:{[insert into product (color, id) values (?, ?)][Blue,4af0b843-da3f-4b38-aa42-1e590db186a9]}
Query:{[select count(*) as col_0_0_ from user user0_ cross join product product1_ where user0_.favoriteColor=product1_.color][]}
This works because Entity Queries are parsed and translated to SQL queries. Hibernate cannot reference a non-existent table, therefore it always knows the database tables an HQL/JPQL query will hit.
So Hibernate is only aware of those tables we explicitly referenced in our HQL query. If the current pending DML statements imply database triggers or database level cascading, Hibernate won’t be aware of those. So even for HQL, the AUTO flush mode can cause consistency issues.
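The query-space check described in this section can be pictured with a toy model. This is a minimal sketch, not Hibernate's actual implementation: the class `AutoFlushSketch` and its methods are hypothetical names, and the table names are plain strings chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the AUTO flush decision; not Hibernate's actual implementation. */
final class AutoFlushSketch {

    // Tables with unflushed INSERT/UPDATE/DELETE statements (hypothetical bookkeeping).
    private final Set<String> pendingTables = new HashSet<>();

    /** Records that an entity mapped to the given table has a pending change. */
    void registerPendingChange(String table) {
        pendingTables.add(table);
    }

    /** Returns true only when the query touches a table that has pending changes. */
    boolean needsFlushBefore(String... querySpaces) {
        return !Collections.disjoint(pendingTables, Arrays.asList(querySpaces));
    }
}
```

With a pending change registered for `product`, `needsFlushBefore("user")` is false, while `needsFlushBefore("user", "product")` is true. A query against a hypothetical `product_audit` table filled by a database trigger would also report false, which is the blind spot mentioned above: the model only knows the tables it was told about.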
AUTO flush and native SQL queries
When it comes to native SQL queries, things get much more complicated. Hibernate cannot parse SQL queries, because it only supports a limited database query syntax. Many database systems offer proprietary features that are beyond Hibernate Entity Query capabilities.
Querying the Product table with a native SQL query is not going to trigger the flush, causing an inconsistency issue:
Product product = new Product();
session.persist(product);
assertNull(session.createSQLQuery("select id from product").uniqueResult());
DEBUG [main]: o.h.e.i.AbstractSaveEventListener - Generated identifier: 718b84d8-9270-48f3-86ff-0b8da7f9af7c, using strategy: org.hibernate.id.UUIDGenerator
Query:{[select id from product][]}
Query:{[insert into product (color, id) values (?, ?)][12,718b84d8-9270-48f3-86ff-0b8da7f9af7c]}
The newly persisted Product was only inserted during transaction commit, because the native SQL query didn’t trigger the flush. This is a major consistency problem, one that many developers find hard to debug or even foresee. That’s one more reason for always inspecting auto-generated SQL statements.
The same behaviour is observed even for named native queries:
@NamedNativeQueries(
    @NamedNativeQuery(name = "product_ids", query = "select id from product")
)

assertNull(session.getNamedQuery("product_ids").uniqueResult());
So even if the SQL query is pre-loaded, Hibernate won’t extract the associated query space for matching it against the pending DML statements.
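The stale read described in this section can be reproduced with a toy session. This is only a sketch under simplifying assumptions: `StaleReadSketch` and its methods are hypothetical, the "database" is an in-memory list, and the entity query always flushes because it is known to target the product table.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy session: queued inserts reach the "database" only on flush. Not Hibernate code. */
final class StaleReadSketch {

    // Rows already written to the product table.
    private final List<String> productTable = new ArrayList<>();
    // Pending inserts, queued but not yet executed.
    private final List<String> pendingInserts = new ArrayList<>();

    /** Queues an insert without touching the table. */
    void persist(String productId) {
        pendingInserts.add(productId);
    }

    /** Executes all queued inserts against the table. */
    void flush() {
        productTable.addAll(pendingInserts);
        pendingInserts.clear();
    }

    /** Entity query: the target table is known, so pending rows are flushed first. */
    List<String> entityQueryProductIds() {
        flush();
        return new ArrayList<>(productTable);
    }

    /** Native query: the SQL string is opaque, so no flush happens before reading. */
    List<String> nativeQueryProductIds() {
        return new ArrayList<>(productTable);
    }
}
```

After `persist("p1")`, `nativeQueryProductIds()` returns an empty list, whereas `entityQueryProductIds()` returns the new row because it flushes first.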
Overruling the current flush strategy
Even if the current Session defines a default flush strategy, you can always override it on a per-query basis.
Query flush mode
The ALWAYS mode is going to flush the persistence context before any query execution (HQL or SQL). This time, Hibernate applies no optimization and all pending entity state transitions are going to be synchronized with the current database transaction.
assertEquals(product.getId(), session.createSQLQuery("select id from product")
    .setFlushMode(FlushMode.ALWAYS)
    .uniqueResult());
Instructing Hibernate which tables should be synchronized
You could also add a synchronization rule to the SQL query you are about to execute. Hibernate will then know which database tables need to be synchronized prior to executing the query. This is also useful for second-level caching.
assertEquals(product.getId(), session.createSQLQuery("select id from product")
    .addSynchronizedEntityClass(Product.class)
    .uniqueResult());
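The two overrides can be compared in a toy model. This is a sketch under stated assumptions, not Hibernate's API: `NativeQueryFlushSketch`, its `Mode` enum, and the `synchronizedTables` parameter are hypothetical stand-ins for the query flush mode and the declared synchronization rule.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Toy model of per-query flush overrides for native queries; not Hibernate code. */
final class NativeQueryFlushSketch {

    enum Mode { AUTO, ALWAYS }

    // Tables that currently have unflushed changes.
    private final Set<String> pendingTables = new HashSet<>();
    private int flushCount = 0;

    void registerPendingChange(String table) {
        pendingTables.add(table);
    }

    int flushCount() {
        return flushCount;
    }

    /**
     * Runs a native query. synchronizedTables holds the tables the caller declared
     * for synchronization; it is empty when nothing was declared.
     */
    void runNativeQuery(Mode mode, String... synchronizedTables) {
        boolean overlaps =
            !Collections.disjoint(pendingTables, Arrays.asList(synchronizedTables));
        if (mode == Mode.ALWAYS || overlaps) {
            flush();
        }
    }

    private void flush() {
        pendingTables.clear();
        flushCount++;
    }
}
```

In this model, an AUTO native query with no declared tables never flushes, declaring `product` makes it flush only when product changes are pending, and ALWAYS flushes regardless of what was declared.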
Conclusion
The AUTO flush mode is tricky, and fixing consistency issues on a per-query basis is a maintainer’s nightmare. If you decide to add a database trigger, you’ll have to check all Hibernate queries to make sure they won’t end up running against stale data.
My suggestion is to use the ALWAYS flush mode, even if Hibernate authors warned us that:
this strategy is almost always unnecessary and inefficient.
Inconsistency is much more of an issue than some occasional premature flushes. While mixing DML operations and queries may cause unnecessary flushing, this situation is not that difficult to mitigate. During a session transaction, it’s best to execute queries at the beginning (when no pending entity state transitions are to be synchronized) and towards the end of the transaction (when the current persistence context is going to be flushed anyway).
The entity state transition operations should be pushed towards the end of the transaction, trying to avoid interleaving them with query operations (therefore preventing a premature flush trigger).
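The effect of ordering can be illustrated with a toy transaction that flushes before every query, as the ALWAYS mode would. This is a minimal sketch with hypothetical names (`FlushOrderingSketch`, `flushesWithWork`); it only counts flushes that actually had pending changes to write.

```java
/** Toy transaction that flushes before every query (ALWAYS-like); not Hibernate code. */
final class FlushOrderingSketch {

    private int pendingChanges = 0;
    private int flushesWithWork = 0;

    /** Queues one entity state transition. */
    void persist() {
        pendingChanges++;
    }

    /** Every query flushes first, as with the ALWAYS mode. */
    void query() {
        flushIfDirty();
    }

    /** The commit flushes whatever is still pending. */
    void commit() {
        flushIfDirty();
    }

    int flushesWithWork() {
        return flushesWithWork;
    }

    // Counts a flush only when there was something to write.
    private void flushIfDirty() {
        if (pendingChanges > 0) {
            pendingChanges = 0;
            flushesWithWork++;
        }
    }
}
```

Interleaving `persist(); query(); persist(); commit();` yields two flushes with work, while running `query(); query(); persist(); persist(); commit();` yields one, the commit-time flush that would have happened anyway.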
Published at DZone with permission of Vlad Mihalcea. See the original article here.