Pitfalls of the Hibernate Second-Level / Query Caches
This post goes through how to set up the Hibernate second-level and query caches, how they work, and what their most common pitfalls are.
The Hibernate second level cache is an application level cache for storing entity data. The query cache is a separate cache that stores query results only.
The two caches really go together, as there are not many cases where we would like to use one without the other. When well used these caches provide improved performance in a transparent way, by reducing the number of SQL statements that hit the database.
How does the second-level cache work?
The second level cache stores the entity data, but NOT the entities themselves. The data is stored in a 'dehydrated' format which looks like a hash map where the key is the entity Id, and the value is a list of primitive values.
Here is an example of how the contents of the second-level cache look:
*-----------------------------------------*
| Person Data Cache |
|-----------------------------------------|
| 1 -> [ "John" , "Q" , "Public" , null ] |
| 2 -> [ "Joey" , "D" , "Public" , 1 ] |
| 3 -> [ "Sara" , "N" , "Public" , 1 ] |
*-----------------------------------------*
The second-level cache gets populated when an object is loaded by Id from the database, for example using entityManager.find(), or when traversing lazily initialized relations.
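The 'dehydrated' format can be pictured with a small plain-Java model. This is only a conceptual sketch, not Hibernate's internal cache entry classes: the `Person` type, its field names, and the reading of the fourth value as a parent Id are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual model only; Hibernate's real cache entries are internal structures.
final class DehydratedCacheSketch {

    // Hypothetical entity used for illustration.
    static final class Person {
        final long id;
        final String firstName;
        final String middleInitial;
        final String lastName;
        final Long parentId; // assumed meaning of the fourth cached value

        Person(long id, String firstName, String middleInitial, String lastName, Long parentId) {
            this.id = id;
            this.firstName = firstName;
            this.middleInitial = middleInitial;
            this.lastName = lastName;
            this.parentId = parentId;
        }
    }

    // One cache region: entity Id -> flat array of values, not the entity object itself.
    private final Map<Long, Object[]> personRegion = new HashMap<>();

    // "Dehydrate": store only the field values, keyed by the entity Id.
    void put(Person p) {
        personRegion.put(p.id, new Object[] {p.firstName, p.middleInitial, p.lastName, p.parentId});
    }

    // "Re-hydrate": build a fresh Person from the cached values, or return null on a miss.
    Person get(long id) {
        Object[] state = personRegion.get(id);
        if (state == null) {
            return null;
        }
        return new Person(id, (String) state[0], (String) state[1], (String) state[2], (Long) state[3]);
    }
}
```

Because only the values are stored, two reads of the same Id in this model produce two distinct `Person` instances with equal contents.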
How does the query cache work?
The query cache looks conceptually like a hash map where the key is composed of the query text and the parameter values, and the value is a list of the entity Ids that match the query:
*----------------------------------------------------------*
| Query Cache |
|----------------------------------------------------------|
| ["from Person where firstName=?", ["Joey"] ] -> [1, 2]   |
*----------------------------------------------------------*
Some queries don't return entities, instead they return only primitive values. In those cases the values themselves will be stored in the query cache. The query cache gets populated when a cacheable JPQL/HQL query gets executed.
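As a rough illustration of that key structure, the sketch below models the query cache as a map from a (query text, parameter values) pair to a list of Ids. The `QueryKey` class is hypothetical and much simpler than the key Hibernate actually builds internally.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Conceptual model only; not Hibernate's actual query cache implementation.
final class QueryCacheSketch {

    // Hypothetical key: two keys are equal only if both the text and the parameters match.
    static final class QueryKey {
        final String queryText;
        final List<Object> parameters;

        QueryKey(String queryText, Object... parameters) {
            this.queryText = queryText;
            this.parameters = Arrays.asList(parameters.clone());
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof QueryKey)) {
                return false;
            }
            QueryKey other = (QueryKey) o;
            return queryText.equals(other.queryText) && parameters.equals(other.parameters);
        }

        @Override
        public int hashCode() {
            return Objects.hash(queryText, parameters);
        }
    }

    // Query key -> Ids of the matching entities (not the entities themselves).
    private final Map<QueryKey, List<Long>> results = new HashMap<>();

    void put(QueryKey key, List<Long> entityIds) {
        results.put(key, entityIds);
    }

    // Returns the cached Ids, or null if this exact query and parameters were never cached.
    List<Long> get(QueryKey key) {
        return results.get(key);
    }
}
```

The same query text with a different parameter value is a different key, so it is a cache miss in this model.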
What is the relation between the two caches?
If a query under execution has previously cached results, then no SQL statement is sent to the database. Instead the query results are retrieved from the query cache, and then the cached entity identifiers are used to access the second level cache.
If the second level cache contains data for a given Id, it re-hydrates the entity and returns it. If the second level cache does not contain the results for that particular Id, then an SQL query is issued to load the entity from the database.
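The lookup flow described above can be sketched with plain maps and a counter for SQL statements. This is a simplified model under stated assumptions: the `database` map stands in for a real table, the single query simply selects every row, and entities loaded by the query are assumed to be cacheable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual model of how the query cache and the second-level cache cooperate.
final class TwoCacheLookupSketch {
    final Map<Long, String> database = new TreeMap<>();          // stand-in for the Person table
    final Map<String, List<Long>> queryCache = new HashMap<>();  // query key -> entity Ids
    final Map<Long, String> entityCache = new HashMap<>();       // entity Id -> cached state
    int sqlStatements = 0;                                       // statements "sent" to the database

    List<String> findAllPersons() {
        String queryKey = "from Person";
        List<Long> ids = queryCache.get(queryKey);
        List<String> result = new ArrayList<>();

        if (ids == null) {
            // Query cache miss: one statement runs the query and loads all rows.
            sqlStatements++;
            ids = new ArrayList<>(database.keySet());
            queryCache.put(queryKey, ids);
            for (Long id : ids) {
                String row = database.get(id);
                entityCache.put(id, row); // assumed cacheable, so stored for later hits
                result.add(row);
            }
            return result;
        }

        // Query cache hit: resolve each cached Id against the entity cache.
        for (Long id : ids) {
            String row = entityCache.get(id);
            if (row == null) {
                // Entity cache miss: one select by Id per missing entity.
                sqlStatements++;
                row = database.get(id);
                entityCache.put(id, row);
            }
            result.add(row);
        }
        return result;
    }
}
```

In this model a repeated call costs no SQL at all while both caches are warm, but costs one statement per Id as soon as the entity cache has lost its entries while the query cache has not.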
How to set up the two caches in an application
The first step is to include the hibernate-ehcache jar in the classpath:
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-ehcache</artifactId>
<version>SOME-HIBERNATE-VERSION</version>
</dependency>
The following parameters need to be added to the configuration of your EntityManagerFactory or SessionFactory:
<prop key="hibernate.cache.use_second_level_cache">true</prop>
<prop key="hibernate.cache.use_query_cache">true</prop>
<prop key="hibernate.cache.region.factory_class">org.hibernate.cache.ehcache.EhCacheRegionFactory</prop>
<prop key="net.sf.ehcache.configurationResourceName">/your-cache-config.xml</prop>
Prefer using EhCacheRegionFactory instead of SingletonEhCacheRegionFactory. Using EhCacheRegionFactory means that Hibernate will create separate cache regions for Hibernate caching, instead of trying to reuse cache regions defined elsewhere in the application.
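If the factory is configured in code rather than in XML, the same four settings can be collected in a map. The sketch below only builds that map; the helper name `cacheSettings` is hypothetical. One way to use it, assuming a JPA setup, would be to pass the map as the second argument of `Persistence.createEntityManagerFactory(unitName, settings)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds the cache-related settings shown above as a plain map.
final class CacheSettingsSketch {

    // ehcacheConfigResource is the classpath location of the Ehcache XML file,
    // for example "/your-cache-config.xml".
    static Map<String, String> cacheSettings(String ehcacheConfigResource) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("hibernate.cache.use_second_level_cache", "true");
        settings.put("hibernate.cache.use_query_cache", "true");
        settings.put("hibernate.cache.region.factory_class",
                "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        settings.put("net.sf.ehcache.configurationResourceName", ehcacheConfigResource);
        return settings;
    }
}
```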
The next step is to configure the cache region settings in the file your-cache-config.xml:
<?xml version="1.0" ?>
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
updateCheck="false"
xsi:noNamespaceSchemaLocation="ehcache.xsd" name="yourCacheManager">
<diskStore path="java.io.tmpdir"/>
<cache name="yourEntityCache"
maxEntriesLocalHeap="10000"
eternal="false"
overflowToDisk="false"
timeToLiveSeconds="86400" />
<cache name="org.hibernate.cache.internal.StandardQueryCache"
maxElementsInMemory="10000"
eternal="false"
timeToLiveSeconds="86400"
overflowToDisk="false"
memoryStoreEvictionPolicy="LRU" />
<defaultCache
maxElementsInMemory="10000"
eternal="false"
timeToLiveSeconds="86400"
overflowToDisk="false"
memoryStoreEvictionPolicy="LRU" />
</ehcache>
If no cache settings are specified, default settings are taken, but this is probably best avoided. Make sure to give the cache a name by filling in the name attribute in the ehcache element.
Giving the cache a name prevents it from using the default name, which might already be used somewhere else in the application.
Using the second level cache
The second-level cache is now ready to be used. In order to cache entities, annotate them with the @org.hibernate.annotations.Cache annotation:
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_ONLY,
       region = "yourEntityCache")
public class SomeEntity {
    ...
}
Associations can also be cached by the second-level cache, but by default this is not done. In order to enable caching of an association, we need to apply @Cache to the association itself:
@Entity
public class SomeEntity {
    @OneToMany
    @Cache(usage = CacheConcurrencyStrategy.READ_ONLY,
           region = "yourCollectionRegion")
    private Set<OtherEntity> other;
}
Using the query cache
After configuring the query cache, no queries are cached by default. Queries need to be marked as cacheable explicitly. For example, this is how a named query can be marked as cacheable:
@NamedQuery(name="account.queryName",
query="select acct from Account ...",
hints={
@QueryHint(name="org.hibernate.cacheable",
value="true")
}
)
And this is how to mark a criteria query as cached:
List cats = session.createCriteria(Cat.class)
    .setCacheable(true)
    .list();
The next section goes over some pitfalls that you might run into while trying to set up these two caches. These are behaviors that work as designed but can still be surprising.
Pitfall 1 - Query cache worsens performance causing a high volume of queries
There is a harmful side effect of how the two caches work, which occurs if the cached entities returned by a query expire sooner than the cached query results, or are not cached at all.
If a query has cached results, it returns a list of entity Ids, which is then resolved against the second-level cache. If the entities with those Ids were not configured as cacheable or if they have expired, then a select will hit the database per entity Id.
For example, if a cached query returned 1000 entity Ids and none of those entities were cached in the second-level cache, then 1000 selects by Id will be issued against the database.
The solution to this problem is to configure query results expiration to be aligned with the expiration of the entities returned by the query.
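The effect of misaligned expiration can be simulated with a toy time-to-live cache driven by an explicit clock. Everything here is hypothetical scaffolding (`TtlCache`, `statementsNeeded`, the TTL values); it only counts how many statements one read would need in this model, and it is not how Ehcache implements expiry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy simulation of query-result and entity expiration; not real Ehcache behavior.
final class ExpiryAlignmentSketch {

    // Minimal time-to-live cache; the caller supplies the current time in seconds.
    static final class TtlCache<K, V> {
        private final long ttlSeconds;
        private final Map<K, V> values = new HashMap<>();
        private final Map<K, Long> storedAt = new HashMap<>();

        TtlCache(long ttlSeconds) {
            this.ttlSeconds = ttlSeconds;
        }

        void put(K key, V value, long now) {
            values.put(key, value);
            storedAt.put(key, now);
        }

        // Returns null if the key is absent or its entry is older than the TTL.
        V get(K key, long now) {
            Long t = storedAt.get(key);
            if (t == null || now - t >= ttlSeconds) {
                return null;
            }
            return values.get(key);
        }
    }

    // Both caches are filled at time 0 with `rows` entities; returns how many SQL
    // statements a single execution of the cached query needs at time `readAt`.
    static int statementsNeeded(long queryTtl, long entityTtl, int rows, long readAt) {
        TtlCache<String, List<Long>> queryCache = new TtlCache<>(queryTtl);
        TtlCache<Long, String> entityCache = new TtlCache<>(entityTtl);
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= rows; id++) {
            ids.add(id);
            entityCache.put(id, "row-" + id, 0);
        }
        queryCache.put("q", ids, 0);

        List<Long> cachedIds = queryCache.get("q", readAt);
        if (cachedIds == null) {
            return 1; // query results expired: the query is re-executed as one statement
        }
        int statements = 0;
        for (Long id : cachedIds) {
            if (entityCache.get(id, readAt) == null) {
                statements++; // entity expired: one select by Id
            }
        }
        return statements;
    }
}
```

With a query TTL of 600 seconds and an entity TTL of 60 seconds, a read at 120 seconds needs 1000 statements for 1000 rows in this model. With both TTLs at 60 seconds the same read needs a single statement, and with both at 600 seconds it needs none.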
Pitfall 2 - Cache limitations when used in conjunction with @Inheritance
It is currently not possible to specify different caching policies for different subclasses of the same parent entity.
For example this will not work:
@Entity
@Inheritance
@Cache(usage = CacheConcurrencyStrategy.READ_ONLY)
public class BaseEntity {
    ...
}
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class SomeReadWriteEntity extends BaseEntity {
    ...
}
@Entity
@Cache(usage = CacheConcurrencyStrategy.TRANSACTIONAL)
public class SomeTransactionalEntity extends BaseEntity {
    ...
}
In this case only the @Cache annotation of the parent class is considered, and all concrete entities have the READ_ONLY concurrency strategy.
Pitfall 3 - Cache settings get ignored when using a singleton based cache
It is advised to configure the cache region factory as an EhCacheRegionFactory, and specify an ehcache configuration via net.sf.ehcache.configurationResourceName.
There is an alternative to this region factory, which is SingletonEhCacheRegionFactory. With this region factory the cache regions are stored in a singleton using the cache name as a lookup key.
The problem with the singleton region factory is that if another part of the application had already registered a cache with the default name in the singleton, this causes the ehcache configuration file passed via net.sf.ehcache.configurationResourceName to be ignored.
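The "first registration wins" behavior can be pictured with a name-keyed registry. This is a toy model of the lookup described above, not Ehcache's or Hibernate's code; the class, the method, and the `DEFAULT_NAME` value are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a shared registry that is looked up by cache manager name.
final class SingletonRegistrySketch {
    static final String DEFAULT_NAME = "default"; // hypothetical default name

    // Name -> configuration resource that was used when the name was first registered.
    private final Map<String, String> configByName = new HashMap<>();

    // Returns the configuration actually in effect for this name. If the name is
    // already registered, the configuration passed by the caller is silently ignored.
    String getOrCreate(String name, String configResource) {
        String existing = configByName.get(name);
        if (existing != null) {
            return existing;
        }
        configByName.put(name, configResource);
        return configResource;
    }
}
```

In this model, a second caller that uses the default name gets the first caller's configuration back, while a caller that uses its own name, such as yourCacheManager, gets the configuration it asked for.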
Conclusion
The second-level and query caches are very useful if set up correctly, but there are some pitfalls to bear in mind in order to avoid unexpected behaviors. All in all, it is a feature that works transparently and that, if used well, can significantly improve the performance of an application.
Please let us know in the comments below about your own experience and the pitfalls you have encountered. Thanks for reading.
Useful Links
This blog post is a well-known reference to the inner details of the Hibernate second level and query caches - Truly Understanding the Second-Level and Query Caches