Effective Usage of Oracle Row-Locking With Spring JDBC
You may be thinking, "Is it *that* hard to use Oracle row-locking without any issues in an enterprise-level application?" Unfortunately, yes, it is.
Database row-locking is one of the great features provided by RDBMS vendors to manage synchronization of concurrent data processing activities on a given table. Although it’s a great feature, it can lead to critical scenarios like your system being stuck due to stale locks on the database side.
In this blog post, I’m going to explain how to use Spring JDBC with the row-locking feature of Oracle database to maintain the synchronization of concurrent data processing tasks that are expected to transform some data and persist it back to the database.
If you have already done it, believe me, you are going to learn how to do it correctly without getting any stale locks and how to troubleshoot stale locks (for incorrect usage of row-locking) at the Oracle side.
Background
As the example use case for this blog post, let’s consider that we have a database table with message data that should be processed concurrently using a Java thread pool. Threads in this pool will periodically poll the database to select candidate rows to process. While processing a row, a thread will transform the message content and change the state of the message to PROCESSED to mark it as completed.
In the candidate row selection criteria, some conditions can be included in the SQL statement itself via the WHERE clause, but there are other conditions that cannot be expressed in the SQL query. To keep it simple, let’s assume that this additional evaluation logic is encapsulated in a method called isValidToProcess(). Our example message table schema would look like this:
CREATE TABLE MSG_DATA
(
MSG_ID INTEGER PRIMARY KEY NOT NULL,
MSG_STATUS INT NOT NULL,
MSG_CONTENT CLOB,
PROC_CONTENT CLOB,
PROC_TIME TIMESTAMP,
VENDOR_ID INT NOT NULL
);
MSG_STATUS refers to an enum representing the status of the message life cycle, i.e. whether it’s processed or not. For this example, let’s say we should process message data with status value 1, and once a message is processed, we set that value to 2 to mark the row as processed. The VENDOR_ID value will be used by the filtering logic in the isValidToProcess() method.
So our execution flow would be like this:
- Select message IDs to process based on the MSG_STATUS value 1.
- Go through the message IDs one by one, acquiring a row lock for the respective message ID in each iteration.
- While holding the row lock, check whether the row is a valid candidate to process based on the isValidToProcess() method. If not, ignore that row and release the row lock.
- If it’s a valid candidate, process the data in the MSG_CONTENT column.
- Save the processed message content in the PROC_CONTENT column.
- Update the MSG_STATUS column value to 2 to mark it as a processed message.
- Release the row lock.
Okay, that’s enough background! Let’s get to some interesting implementation stuff.
Implementation
First of all, we should declare a data source to access data via Spring JDBC. For that, let's configure our database as shown below in our Spring configuration:
<bean id="dataSource" class="oracle.jdbc.pool.OracleDataSource" destroy-method="close">
<property name="URL" value="${oracle-jdbc-url}" />
<property name="user" value="${oracle-jdbc-user}"/>
<property name="password" value="${oracle-jdbc-password}"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="connectionCacheName" value="oracle-cache"/>
<property name="connectionCacheProperties">
<props>
<prop key="MinLimit">2</prop>
<prop key="MaxLimit">20</prop>
<prop key="InitialLimit">5</prop>
<prop key="ValidateConnection">true</prop>
<prop key="ConnectionWaitTimeout">5000</prop>
</props>
</property>
<property name="connectionProperties">
<props>
<prop key="ReadTimeout">5000</prop>
</props>
</property>
</bean>
Why do we need the data source? There are many reasons, but I’m not going into the details in this blog. In simple terms, a data source together with Spring JDBC makes our lives easier: we can access the database and execute SQL queries without worrying about the underlying SQL connections and resource handling. Declaring it as a Spring bean in our Spring configuration makes it accessible anywhere in the application, and only one instance of the data source is created with the configured number of connections, pooled and cached as we have configured.
Then, let’s write our Java code using this data source. First, we select the possible message ID candidates to process:
JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
List<Long> msgIds;
try {
msgIds = jdbcTemplate.queryForList("SELECT MSG_ID FROM MSG_DATA WHERE MSG_STATUS = 1", Long.class);
} catch (Exception e) {
// handle exception
return;
}
if (msgIds.isEmpty()) {
// Create required log if required
return;
}
In the above code snippet, we first create a jdbcTemplate using the data source and execute the select query to get the message ID list based on the MSG_STATUS column value.
Then, we should go through each message ID and process them one by one after acquiring a row lock. So how are we going to acquire a row lock? It’s quite easy; we just need to modify our SQL query a little bit. However, the exact modification varies depending on your underlying database. In my case, for Oracle, you need to add a FOR UPDATE suffix to your SQL query to get a row lock. (Almost all RDBMSs provide this feature; it’s just a matter of changing your SQL query based on the underlying database.)
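To see why this matters for concurrency, here is a plain-Java analogy (an illustration, not actual Oracle behavior) of what FOR UPDATE does: a second worker asking for the same row simply blocks until the current holder releases the lock. A ReentrantLock stands in for the row lock:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class BlockingLockDemo {
    public static void main(String[] args) throws InterruptedException {
        // One lock standing in for the row lock a FOR UPDATE query acquires.
        ReentrantLock rowLock = new ReentrantLock();
        StringBuilder order = new StringBuilder();
        CountDownLatch firstHoldsLock = new CountDownLatch(1);

        Thread first = new Thread(() -> {
            rowLock.lock();                 // "SELECT ... FOR UPDATE" succeeds
            firstHoldsLock.countDown();
            try {
                Thread.sleep(200);          // simulate row processing
                order.append("first-done ");
            } catch (InterruptedException ignored) {
            } finally {
                rowLock.unlock();           // commit/updateRow releases the lock
            }
        });

        Thread second = new Thread(() -> {
            try {
                firstHoldsLock.await();     // ensure the first thread already holds the lock
            } catch (InterruptedException ignored) {
            }
            rowLock.lock();                 // same FOR UPDATE on the same row: blocks here
            try {
                order.append("second-done");
            } finally {
                rowLock.unlock();
            }
        });

        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println(order);  // the second thread always finishes after the first
    }
}
```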
for (final Long msgId : msgIds) {
try {
// row lock
PreparedStatementCreatorFactory pscf = new PreparedStatementCreatorFactory(
"SELECT MSG_ID , MSG_STATUS, MSG_CONTENT, PROC_CONTENT, PROC_TIME, VENDOR_ID " +
"FROM MSG_DATA WHERE MSG_ID = ? AND MSG_STATUS = 1 FOR UPDATE",
new int[] {Types.INTEGER});
pscf.setUpdatableResults(true);
RowCallbackHandler rch = new RowCallbackHandler() {
@Override
public void processRow(ResultSet resultSet) throws SQLException {
// Handle the processing logic to process the row with the row lock
if (!isValidToProcess(resultSet.getInt("VENDOR_ID"))) { // Check whether it's possible candidate based on the vendor id
resultSet.updateRow();
return;
}
Clob processedData = processMsg(resultSet.getClob("MSG_CONTENT")); // process msg data
resultSet.updateClob("PROC_CONTENT", processedData); //update column PROC_CONTENT with processed data
resultSet.updateInt("MSG_STATUS", 2);
resultSet.updateRow(); //flushing all the updates to the table and release the row lock
}
};
jdbcTemplate.query(pscf.newPreparedStatementCreator(new Object[] {msgId}), rch);
} catch (Exception e) {
// Handle the exception
return;
}
}
Let me explain the above code snippet. I believe the prepared statement and jdbcTemplate are quite familiar to a typical Java developer, so I’m not going to explain those. First, I modified the SQL statement to add the message ID to the WHERE clause and added the FOR UPDATE phrase to specify that we are going to update the rows selected by the SELECT query. If you are going to update a set of values after reading them, you must get a row lock for that, and Oracle is intelligent enough to do it internally. Then, you need to say that you want an updatable result set as the output; for that, you call setUpdatableResults() with the argument true. One important thing: we are getting a write lock on the row here, so the row remains available for all read-only operations while you are executing the FOR UPDATE query.
The next main thing is to create a RowCallbackHandler containing the logic to be executed on the data once you have acquired the row lock. In the processRow method of the row callback handler, you implement the complete processing logic for the selected row. Within this block, you can access (read and write) all the columns selected by the select statement. In my example, I first evaluate the isValidToProcess() method based on the VENDOR_ID, then call the processMsg() method with the MSG_CONTENT column’s CLOB value to do the required processing of the message content. The resulting processed content is saved as a CLOB in the PROC_CONTENT column, and the MSG_STATUS column value is updated to 2. Remember, this marks it as a processed row. At the end of the processRow method, you must call the resultSet.updateRow() method to complete the data processing for this row and release the row lock. After you call updateRow(), you cannot make further modifications to the result set, and your locked row becomes accessible (writable) to the outside.
Finally, you call the jdbcTemplate.query() API, providing the prepared statement creator (built with the message ID to be used in the WHERE clause) and the row callback handler instance.
OK, assuming that we are done, now you can run the above code block using your thread pool and let it be executed by multiple threads at the same time.
Although I have explained it as if there isn’t any issue with the above code block, believe me, there are a few mistakes that I have made intentionally to make it more understandable.
What are those mistakes?
If you think carefully, when multiple threads compete to select a row to process, there can be a case where two (or more) threads are competing for the same row. Then, one thread will get the row lock and the other will have to wait until the first one completes its processing and releases it. This is not an effective way to achieve our expected throughput. We need a way to say that if a row is already locked, we want to skip it instead of waiting for the lock to be released.
How do we implement this? Fortunately, Oracle has a cool feature to ignore already-locked rows when you are asking for a row lock: you just have to add SKIP LOCKED to your query. Some other database vendors also provide this feature, but not all. For example, MSSQL has an equivalent (the READPAST hint), while MySQL only added SKIP LOCKED as a built-in feature in version 8.0.
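Continuing the plain-Java analogy from before (again, this is an illustration, not Oracle behavior): SKIP LOCKED behaves like tryLock() instead of lock(). The worker doesn’t wait for a locked row; it just moves on:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class SkipLockedDemo {
    public static void main(String[] args) throws InterruptedException {
        ReentrantLock rowLock = new ReentrantLock();
        CountDownLatch locked = new CountDownLatch(1);
        CountDownLatch checked = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            rowLock.lock();        // another worker already holds the row lock
            locked.countDown();
            try {
                checked.await();   // keep the row locked while we poll below
            } catch (InterruptedException ignored) {
            } finally {
                rowLock.unlock();
            }
        });
        holder.start();
        locked.await();

        // FOR UPDATE SKIP LOCKED ~ tryLock(): don't wait, just move to the next row.
        if (rowLock.tryLock()) {
            try {
                System.out.println("processed");
            } finally {
                rowLock.unlock();
            }
        } else {
            System.out.println("skipped");
        }
        checked.countDown();
        holder.join();
    }
}
```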
Then what is the next problem? Did you run the above code block yourself? If you did, with a higher concurrency level, you will get an exception like the one below:
java.sql.SQLException: Value conflicts occurs
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:83)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:605)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:639)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:664)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:684)
...
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.sql.SQLException: Value conflicts occurs
at oracle.jdbc.driver.UpdatableResultSet.executeUpdateRow(UpdatableResultSet.java:4165)
at oracle.jdbc.driver.UpdatableResultSet.updateRow(UpdatableResultSet.java:2968)
...
at org.springframework.jdbc.core.JdbcTemplate$RowCallbackHandlerResultSetExtractor.extractData(JdbcTemplate.java:1492)
at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:651)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:589)
... 13 more
So what is this? What is the reason? And how do we fix it?
This is a famous exception (see here and here for just a few examples) that can happen due to data inconsistency at the underlying database layer. You may be wondering: why are we getting this error? Did we do something that breaks data consistency?
Of course not! We haven’t done anything like that intentionally; logically, the above code block is completely correct from a data consistency point of view. The issue is that the underlying database layer is trying to do some advanced and intelligent things for us to make access more efficient. In the above example, although we asked for only one row specifically (by selecting on the primary key, MSG_ID), the database layer assumes that we will access the adjacent rows afterwards. Based on that assumption, while giving us the requested row, it keeps a kind of weak reference (but not a lock) to those adjacent rows as well. Then, when we try to commit our changes after the updateRow() call, there is a possibility that those adjacent rows have already been updated by other threads. The database sees this as a data inconsistency (some data in the adjacent rows differs from the data it read) and throws a “Value conflicts occurs” exception.
Poor us! Although we have done everything correctly, the underlying database layer is trying to be too clever on our behalf. How are we going to resolve this? It’s not that difficult; you just have to configure two parameters. First, you must configure your result set type to be forward-only. You can do this via the prepared statement creator factory:
pscf.setResultSetType(ResultSet.TYPE_FORWARD_ONLY);
Next, you have to set maxRows to 1 on the jdbcTemplate to inform the database layer that you are only looking for one row (please don’t try to do smart things for me!). You can do it as shown below:
jdbcTemplate.setMaxRows(1);
That’s it! Now, you are almost done. Now you should be able to run the above code using multiple threads without any concurrency issues.
However, there is one more issue here. (Oh come on, another one?!) Let me explain what it is. If, by any chance, our isValidToProcess() method returns false, what will be the outcome? It’s simple: we call the resultSet.updateRow() method to release the lock and return. All good, isn’t it?
Unfortunately, it’s not (note that this behavior can vary depending on the Oracle client jar, the Oracle database version, and the connection caching parameters that you are using). If you run this code block with the last two improvements related to the value conflict exception, and you have data that returns false from the isValidToProcess() method, then after some time you will see stale locks on the Oracle side. I have experienced this in one of our projects, and believe me, it’s one of the most critical and difficult issues to troubleshoot. Basically, you only see that some rows in your table become inaccessible for further updates, without any exceptions in your logs. The interesting thing is that, as per the API documentation of Spring JDBC and Oracle, you have done your implementation correctly.
Sounds like a tough one, doesn’t it? How are you going to troubleshoot it? And then, what is the fix?
For troubleshooting, I suspected that this was due to our row-locking code, because there wasn’t anything else suspicious at the time. To confirm it, I ran the following queries as sysdba (you can log in as sysdba by executing the sqlplus / as sysdba command as the Oracle OS user).
First, run these commands to get more readable output from sqlplus.
SET PAGESIZE 100 LINESIZE 200
COL locked_object FORMAT A20
COL program FORMAT A18
COL session_logon_time FORMAT A20
COL transaction_start_time FORMAT A25
COL locked_mode FORMAT A10
ALTER SESSION SET nls_date_format='MM/DD/YY HH24:MI:SS';
Then, run the following query to check existing locks:
SELECT gs.sid,
gs.serial#,
gs.program,
gs.logon_time session_logon_time,
gt.start_time transaction_start_time,
gs.sql_exec_start,
gs.prev_exec_start,
gs.status session_status,
gt.status transaction_status,
o.object_name locked_object,
DECODE(lo.locked_mode, 0, 'NONE',
1, 'NULL',
2, 'ROW_S',
3, 'ROW_X',
4, 'TAB_S',
5, 'TAB_SSX',
6, 'TAB_X', 'UNKNOWN') locked_mode
FROM gv$session gs, gv$transaction gt, dba_objects o, gv$locked_object lo
WHERE gs.inst_id = gt.inst_id
AND gs.inst_id = lo.inst_id
AND gs.saddr = gt.ses_addr
AND gs.sid = lo.session_id
AND o.object_id = lo.object_id
AND gt.status = 'ACTIVE';
The above query checks for stale locks on the database side and, if there are any, returns information about them, such as the locked object, lock start time, and locked mode. If you are lucky and don’t have any stale locks, the query should return no rows. If there are stale locks, each will show up as a row listing the locked object, the lock mode, and the session and transaction timing details.
Now we know we are creating stale locks on the database side. How do we avoid that? What is the problem here? In my case, I had to dig deep into how the Oracle JDBC driver handles an updatable result set. Here, too, Oracle tries to do something smart to improve performance: if we don’t make any updates to the result set object, Oracle assumes there is nothing to flush to the database, since nothing changed in the originally fetched result set. It silently ignores our resultSet.updateRow() call without doing any of the resource handling for the updatable result set. The downside is that, since we are holding a row lock, our acquired lock won’t be released either; it will stay there as long as our program runs (in fact, as long as the connection that created the row lock is alive, although this can also vary depending on configured timeout parameters such as the lock timeout). Fortunately, we can overcome this issue with a minor hack! The only thing we need to do is add the following line before calling the updateRow() method:
resultSet.updateInt("MSG_ID", resultSet.getInt("MSG_ID"));
resultSet.updateRow();
This makes sure we always update the resultSet object before calling updateRow(), so Oracle cannot silently ignore the updateRow() call, and all of the acquired locks and resources are released after it.
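The effect of the hack can be illustrated with a toy stand-in for the updatable result set (this is an analogy for the driver behavior described above, not actual Oracle JDBC code): updateRow() flushes and releases the lock only when something was marked dirty, so a self-assignment forces the flush:

```java
public class DirtyFlushDemo {

    // Toy stand-in for an updatable result set that only flushes (and releases
    // its row lock) when at least one column was actually modified.
    static class ToyUpdatableRow {
        private boolean dirty = false;
        private boolean lockHeld = true;
        private int msgId = 42;

        void updateInt(String column, int value) {
            if ("MSG_ID".equals(column)) msgId = value;
            dirty = true;
        }

        int getInt(String column) { return msgId; }

        void updateRow() {
            if (!dirty) return;   // the silent no-op: nothing flushed, lock kept
            lockHeld = false;     // flush happened, lock released
            dirty = false;
        }

        boolean lockHeld() { return lockHeld; }
    }

    public static void main(String[] args) {
        // Without the mock update, updateRow() is a no-op and the lock leaks.
        ToyUpdatableRow stale = new ToyUpdatableRow();
        stale.updateRow();
        System.out.println("without hack, lock held: " + stale.lockHeld());

        // The hack: a self-assignment marks the row dirty, forcing a real flush.
        ToyUpdatableRow fixed = new ToyUpdatableRow();
        fixed.updateInt("MSG_ID", fixed.getInt("MSG_ID"));
        fixed.updateRow();
        System.out.println("with hack, lock held: " + fixed.lockHeld());
    }
}
```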
That’s it! This time, we are completely done. To make it more robust, let’s wrap the complete logic inside processRow in a try-catch block and move the above two lines (the hack line and updateRow) into the finally block, to make sure we always call them no matter what happens within the processRow method. The final code looks as shown below. This coding pattern has been tested in an enterprise-level deployment and has been running like a charm for a long time under considerable load:
JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
List<Long> msgIds;
try {
msgIds = jdbcTemplate.queryForList("SELECT MSG_ID FROM MSG_DATA WHERE MSG_STATUS = 1", Long.class);
} catch (Exception e) {
// handle exception
return;
}
if (msgIds.isEmpty()) {
// Create required log if required
return;
}
for (final Long msgId : msgIds) {
try {
// row lock
PreparedStatementCreatorFactory pscf = new PreparedStatementCreatorFactory(
"SELECT MSG_ID , MSG_STATUS, MSG_CONTENT, PROC_CONTENT, PROC_TIME, VENDOR_ID " +
"FROM MSG_DATA WHERE MSG_ID = ? AND MSG_STATUS = 1 FOR UPDATE SKIP LOCKED",
new int[] {Types.INTEGER});
pscf.setUpdatableResults(true);
pscf.setResultSetType(ResultSet.TYPE_FORWARD_ONLY);
RowCallbackHandler rch = new RowCallbackHandler() {
@Override
public void processRow(ResultSet resultSet) throws SQLException {
try {
// Handle the processing logic to process the row with the row lock
if (!isValidToProcess(resultSet.getInt("VENDOR_ID"))) { // Check whether it's possible candidate based on the vendor id
return;
}
Clob processedData = processMsg(resultSet.getClob("MSG_CONTENT")); // process msg data
resultSet.updateClob("PROC_CONTENT" , processedData); //update column PROC_CONTENT with processed data
resultSet.updateInt("MSG_STATUS", 2);
} catch (Exception e) {
// Handle any exception while handling the result set
} finally {
resultSet.updateInt("MSG_ID", resultSet.getInt("MSG_ID")); // Mock update to make sure resultSet is always an updated one
resultSet.updateRow(); //flushing all the updates to the table and release the row lock
}
}
};
jdbcTemplate.setMaxRows(1);
jdbcTemplate.query(pscf.newPreparedStatementCreator(new Object[] {msgId}), rch);
} catch (Exception e) {
// Handle the exception
return;
}
}
I know, I know. Now you’re thinking, "What the heck is this? Is it *that* hard to use Oracle row-locking without any issues in an enterprise-level application?" Unfortunately, yes, it is.
I hope you enjoyed my blog and learned new things on the effective usage of Oracle row-locking with Spring JDBC for concurrent data processing without having any stale locks.
Feel free to share your views and comments on this implementation pattern. I’m more than happy to discuss further on this.
Opinions expressed by DZone contributors are their own.