Developing Saga Participant Code for Compensating Transactions
This article shows a full code example of a microservices application involving a saga, including participant compensation logic.
The saga pattern is used to provide data integrity between multiple services and to do so for potentially long-running transactions. There are many blogs written on sagas and long-running transactions, cursory as they tend to be. In this piece, I will not go into detail comparing the advantages and disadvantages of sagas with XA two-phase commit (2PC), distributed locking, etc. I will simply state that XA and 2PC require distributed locks (con) which manage ACID properties such that the user can simply execute rollback or commit (pro), whereas sagas use local transactions only and so do not require distributed locks (pro) but require the user to implement compensation and related logic (con). As Teddy Roosevelt said, “Nothing worth having comes easy.”
What I will do is show a full code example of a microservices application involving a saga, including participant compensation logic, which can be found at this repo.
You can also check out this video that walks through corresponding content:
There are some existing source code examples that show the coordinator/orchestrator side of saga functionality and the related participant callbacks. However, there are few to no examples of the participant-side handling, which is generally the more intensive and error-prone side of the equation, taking an estimated 80% of the development and testing cycles of a saga application implementation. This is what I intend to remedy with this blog and source code.
I will use the orchestration saga model (where a devoted orchestrator coordinates the saga for the participants) as opposed to the choreography saga model (where participants coordinate the saga between themselves) and will use the Eclipse MicroProfile LRA (Long Running Actions) specification as it is the most detailed and modern interoperable standard that exists and also essentially the only one. (I will use the terms “saga” and “LRA” interchangeably in the context of the protocol flow.)
I will also use a bank account transfer scenario rather than the widely (over) used travel booking scenario. I will not delve into what scenarios may be more or less appropriate for sagas vs XA and 2PC. The main reasons I am using the classic account transfer scenario are to give an apples-to-apples comparison of the application behavior and logic and to demonstrate a number of different aspects: for example, an increase and a decrease in balance within the same saga rather than exclusively a decrease in inventory, as is the case with the travel agency example, and the resulting behavior needed to maintain acceptable/safe integrity of reads and writes despite the well-known dangers of eventual consistency.
Before diving into the app and code, let’s quickly go over the basic LRA/saga flow as illustrated below and mention the corresponding LRA annotations while we are at it. Note the color coding of all the calls made.
All of the blue and purple calls are made by the LRA client and coordinator implicitly, i.e., they are not explicitly made by the application code (application code calls are in red).
The Transfer Service receives a request to transfer money from one account to another.
The “transfer” method that is called is annotated with @LRA(value = LRA.Type.REQUIRES_NEW, end = false), therefore, the underlying LRA client library makes a call to the Coordinator/Orchestrator service which creates a new LRA/saga and passes the LRA/saga id back to the Transfer service.
The Transfer Service makes a call to the (Bank)Account (for account 66) service to make the withdraw call. The LRA/saga id is propagated as a header as part of this call.
The “withdraw” method that is called is annotated with @LRA(value = LRA.Type.MANDATORY, end = false), therefore, the underlying client library makes a call to the Coordinator/Orchestrator service which recognizes the LRA/saga id and enlists/joins the Account Service endpoint (address) to the LRA/saga started by the Transfer Service. This endpoint has a number of methods including the Complete and Compensate methods that will be called when the saga/LRA is terminated/ended.
The “withdraw” method is executed and control returns to the Transfer Service.
This is repeated with a call from the Transfer Service to the Account service (for account 67) to make the deposit call.
Depending on the returns from the Account Service calls, the Transfer Service determines if it should close or cancel the saga/LRA. Close and cancel are somewhat analogous to commit and rollback.
The Transfer Service issues the close or cancel call to the Coordinator (I will get into more details on how this is done implicitly when looking closer at the application).
The Coordinator in turn issues complete (in the case of close) or compensate calls on the participants that were joined in saga/LRA previously.
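The flow above can be sketched as a toy, in-memory model. This is only an illustration under stated assumptions, not the MicroTx coordinator or the LRA client: the names SagaParticipant, RecordingParticipant, ToyCoordinator, and SagaFlowSketch are hypothetical, and the REST calls and header propagation are replaced by plain method calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical callback contract; real LRA participants expose these as REST endpoints. */
interface SagaParticipant {
    void complete(String lraId);
    void compensate(String lraId);
}

/** Participant that only records which callback it received. */
class RecordingParticipant implements SagaParticipant {
    private final String name;
    private final List<String> log;

    RecordingParticipant(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override public void complete(String lraId)   { log.add(name + ":complete"); }
    @Override public void compensate(String lraId) { log.add(name + ":compensate"); }
}

/** Toy in-memory coordinator that handles a single LRA. */
class ToyCoordinator {
    private final List<SagaParticipant> enlisted = new ArrayList<>();
    private String lraId;

    /** Plays the role of the call triggered by REQUIRES_NEW: create the LRA, return its id. */
    String start() {
        lraId = UUID.randomUUID().toString();
        return lraId;
    }

    /** Plays the role of the call triggered by MANDATORY: enlist the participant. */
    void join(String id, SagaParticipant participant) {
        check(id);
        enlisted.add(participant);
    }

    /** Close leads to complete on every enlisted participant. */
    void close(String id)  { check(id); enlisted.forEach(p -> p.complete(id)); }

    /** Cancel leads to compensate on every enlisted participant. */
    void cancel(String id) { check(id); enlisted.forEach(p -> p.compensate(id)); }

    private void check(String id) {
        if (!id.equals(lraId)) throw new IllegalStateException("unknown LRA " + id);
    }
}

class SagaFlowSketch {
    /** Runs one transfer and returns the callbacks the participants received. */
    static List<String> run(boolean withdrawSucceeds) {
        List<String> log = new ArrayList<>();
        ToyCoordinator coordinator = new ToyCoordinator();
        String lraId = coordinator.start();
        // The withdraw participant is enlisted as soon as its method is entered.
        coordinator.join(lraId, new RecordingParticipant("withdraw", log));
        if (withdrawSucceeds) {
            coordinator.join(lraId, new RecordingParticipant("deposit", log));
            coordinator.close(lraId);
        } else {
            coordinator.cancel(lraId); // deposit is never called, so it is never enlisted
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

Running main prints [withdraw:complete, deposit:complete] for the successful case and [withdraw:compensate] for the failed withdraw, which matches the callbacks you will see in the account service logs in the walkthrough that follows.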
Let's set up and run the app and then look closer at the code.
The following two steps are related to setup only. The coordinator and transfer and account services can be set up in a number of ways:
Running as standalone processes; i.e., straight Java commands.
Running as/in containers.
Running as containers in Kubernetes environment. We show this in the repos, and it's quite straightforward, but we do not do so in this blog in order to avoid the need to use a Kubernetes environment.
A mix. This is what we do here where the Coordinator is run as/in a container and the transfer and account services are run as Java commands. Therefore, in steps 2 and 3 we are simply configuring the addresses for the LRA callbacks, etc. accordingly.
- We will of course first need a Coordinator and we will use the Oracle Transaction Manager for Microservices (MicroTx). As this blog is focused on compensation logic, although security is of course extremely important in its own right, for simplicity's sake we will disable various security aspects and this can be done in the coordinator’s tcs.yaml configuration file as I’ve done in the version found in the root of the repos. You can then start the coordinator using the docker image and config file with the following command (this assumes you are running from the root dir of the repos where the tcs.yaml file is)...
docker container run --name otmm -v "$(pwd)":/app/config -w /app/config -p 9000:9000/tcp --env CONFIG_FILE=tcs.yaml --add-host docker.for.mac.host.internal:host-gateway -d container-registry.oracle.com/database/otmm:latest
- As mentioned in the first step, we will run the transfer and account services using Java directly rather than using a container as we do with Coordinator so that you can easily examine and tweak it as you like. Therefore, let’s find out the IP address the coordinator container will use to call the localhost where our services (and specifically the LRA/saga participant endpoints) run. We can do this by issuing
ifconfig | grep 'inet 192' | awk '{ print $2 }'
We then edit the address values in saga-examples/transfer/src/main/resources/application.yaml with the IP value returned like so...
account:
  deposit:
    url: http://192.168.205.1:8080/deposit/deposit
  withdraw:
    url: http://192.168.205.1:8080/withdraw/withdraw
transfer:
  cancel:
    url: http://localhost:8081/cancel
    process:
      url: http://192.168.205.1:8081/processcancel
  confirm:
    url: http://localhost:8081/confirm
    process:
      url: http://192.168.205.1:8081/processconfirm
These are the addresses the transfer service uses to call the account service as well as the addresses it uses to call itself when ending the LRA/saga (we'll elaborate on that aspect later).
- The database used is configured in saga-examples/account/src/main/resources/application.yaml like so...
spring:
  # ...
  jpa:
    hibernate:
      ddl-auto: update
  # ...
  datasource:
    url: jdbc:oracle:thin:@obaasdevdb_tp?TNS_ADMIN=/Users/pparkins/Downloads/Wallet_OBAASDEVDB
    username: account
    password: Welcome1234##
Aside from a valid database and account, no other preparation is required as the JPA ddl-auto value is set to update meaning all of the account, journal, etc. tables will be created automatically.
- Now we can build the application by simply running
mvn clean package
from the repos root dir (i.e., saga-examples dir).
- Now run the transfer service (which runs on port 8081) and account service (which runs on port 8080) by running
java -jar target/transfer.jar
from the saga-examples/transfer directory in one terminal and by running
java -jar target/account.jar
from the saga-examples/account directory in another terminal.
- We will now make a few requests to create bank accounts with balances
curl -X POST "http://localhost:8080/api/v1/createAccountWith1000Balance" -H "Content-Type: application/json" -d '{ "AccountCustomerId": "testcustomerid1" }'
curl -X POST "http://localhost:8080/api/v1/createAccountWith1000Balance" -H "Content-Type: application/json" -d '{ "AccountCustomerId": "testcustomerid2" }'
- Note the accountIds generated from the commands and use them to check the balances are indeed $1000
curl http://localhost:8080/api/v1/account/66 | json_pp
curl http://localhost:8080/api/v1/account/67 | json_pp
- Now issue a transfer from the first account to the second for $100
curl -X POST "http://localhost:8081/transfer?fromAccount=66&toAccount=67&amount=100"
- Notice the "transfer status:withdraw succeeded deposit succeeded" return.
- Check the balances of both accounts as done previously and notice the expected change from the transfer.
- Notice in the transfer service the successful calls to withdraw and deposit as well as the close/confirm call.
- Notice in the account service the successful withdraw and deposit calls and corresponding complete calls as well as the afterLRA calls.
- Notice the journal entries made by issuing curl http://localhost:8080/api/v1/journals (I'll elaborate on this in the next section).
- Now issue a transfer from the first account to the second for $1100
curl -X POST "http://localhost:8081/transfer?fromAccount=66&toAccount=67&amount=1100"
- Notice the "transfer status:withdraw failed: insufficient funds" return.
- Check the balances of both accounts as done previously and notice there is no change in either account, as expected, due to insufficient funds.
- Notice in the transfer service the failure response from the call to withdraw as well as the successful cancel call.
- Notice in the account service the unsuccessful withdraw calls and corresponding compensate call as well as the afterLRA call.
- Notice the journal entries made by issuing curl http://localhost:8080/api/v1/journals (again, I'll elaborate on this in the next section).
Now a closer look at the code and what is necessary for failure handling, compensation, etc.
First, we'll look at the Account service.
- Account JPA Model and Repository provide basic operations for a bank account
- Journal JPA Model and Repository serve three purposes...
- Ledger for transfer operations
- Journal for tracking application changes made in an LRA/saga.
- Store for the state of the LRA/saga itself.
These would generally be broken into different tables/repositories and are also a good use case for blockchain, etc. but I've combined them here for simplicity's sake.
*Note that the LRA Id is (re)used as the transferId.
- AccountAndJournalAdminService provides the CRUD admin operations for creating and querying accounts and journal entries.
- AccountTransferDAO contains the common code used by AccountsWithdrawService and AccountsDepositService that does all of the actual data manipulations including journal maintenance (as mentioned above) and account maintenance.
- AccountsWithdrawService methods
- withdraw:
- Persists the LRA state of ParticipantStatus.Active to the journal (entrance into the method indicates the @LRA enlistment/join was successful)
- Checks whether the account is valid and whether sufficient funds exist for the transfer. If not, a failure message is returned to the transfer service.
- Updates the account (reduces the balance) and creates a journal entry describing the withdraw/update. These two actions must be done in the same local transaction and so are called in an updateAccountBalance method demarcated by the @Transactional Spring annotation, which uses the DataSourceTransactionManager to ensure both repositories (account and transfer) use the same underlying JDBC connection (and thus the same atomic/local transaction). This journal entry will be used later during complete or compensate.
- complete:
- As the money was already withdrawn from the account, in the case of complete we only update the journal to reflect the state of the LRA as ParticipantStatus.Completed or ParticipantStatus.FailedToComplete (if something went wrong such as a lost connection to the database or some such). The method can be called multiple times with the same result (i.e., is idempotent) and so no further handling is necessary.
- compensate:
- This method first checks the state of the LRA and if it is already ParticipantStatus.Compensating or ParticipantStatus.Compensated the method immediately returns in order to avoid duplicate processing that could occur otherwise in recovery scenarios.
- Otherwise, the journal for the withdraw activity of this LRA/saga is queried, the changes that were recorded in the journal are applied (i.e., the money is put back into the account), and the journal is updated. These three activities must be conducted in the same local transaction and so, as was done for the withdraw method/operation, a devoted doCompensationWork method, demarcated with @Transactional, is used to encapsulate these activities and make them atomic.
- status:
- This method is called by the coordinator to determine the state of the LRA participant and so the journal is queried to get this state and return it. For example, the coordinator can determine if the participant is compensating, compensated, completing, or completed and in any of these cases will not make duplicate calls to complete or compensate.
- afterLRA:
- This method is called after an LRA is finished, so the participant can use it to clean up any caches, etc. If the LRA ended successfully, we could purge its journal entry here by doing
if (isLRASuccessfullyEnded(status)) journalRepository.delete(getJournalForLRAid(lraId, journalType))
However, we keep the entry for analysis/auditing after the fact.
- AccountsDepositService methods
- deposit:
- Persists the LRA state of ParticipantStatus.Active to the journal (entrance into the method indicates the @LRA enlistment/join was successful)
- Checks whether the account is valid. If not, a failure message is returned to the transfer service.
- Unlike withdraw, we do not actually update the account balance and instead just create a journal entry describing the deposit/update. This shows another key aspect/difference in saga applications, which is the application logic needed to ensure that eventually consistent (i.e., potentially inconsistent) reads and writes do not cause an issue. In this financial scenario, it would be acceptable if the funds were temporarily unavailable in both accounts but would not be acceptable if the funds were available in both. This is the reason for the difference in behavior between the withdraw and deposit methods, something we would not (need to) explicitly handle in the case of XA and 2PC. This journal entry will be used later during complete or compensate.
- complete:
- This method first checks the state of the LRA and if it is already ParticipantStatus.Completing or ParticipantStatus.Completed the method immediately returns in order to avoid duplicate processing that could occur otherwise in recovery scenarios.
- Otherwise, the journal for the deposit activity of this LRA/saga is queried, the changes that were recorded in the journal are applied (i.e., the money is put into the account), and the journal is updated. These three activities must be conducted in the same local transaction and so, as was done for the withdraw method/operation, a devoted doCompensationWork method, demarcated with @Transactional, is used to encapsulate these activities and make them atomic.
- compensate:
- As the money was never actually deposited in the account, in the case of compensate we only update the journal to reflect the state of the LRA as ParticipantStatus.Compensated or ParticipantStatus.FailedToCompensate (if something went wrong such as a lost connection to the database or some such). The method can be called multiple times with the same result (i.e., is idempotent) and so no further handling is necessary.
- status:
- This method is called by the coordinator to determine the state of the LRA participant and so the journal is queried to get this state and return it. For example, the coordinator can determine if the participant is compensating, compensated, completing, or completed and in any of these cases will not make duplicate calls to complete or compensate.
- afterLRA:
- This method is called after an LRA is finished, so the participant can use it to clean up any caches, etc. If the LRA ended successfully, we could purge its journal entry here by doing
if (isLRASuccessfullyEnded(status)) journalRepository.delete(getJournalForLRAid(lraId, journalType))
However, we keep the entry for analysis/auditing after the fact.
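The participant logic described above for the Account service can be condensed into a small, in-memory sketch. It is not the repo's code: SketchState, SketchJournal, and AccountParticipantSketch are hypothetical names, plain maps stand in for the JPA repositories, and the local transactions are only indicated in comments. It also assumes that the deposit-side complete guards on the completing/completed states. Error handling for unknown LRA ids is omitted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mirror of the participant states stored in the journal. */
enum SketchState { Active, Completing, Completed, Compensating, Compensated }

/** Simplified stand-in for a journal row; field names are illustrative. */
class SketchJournal {
    final long accountId;
    long amount;                              // stays 0 until the operation is accepted
    SketchState state = SketchState.Active;   // entering the method means the join succeeded

    SketchJournal(long accountId) { this.accountId = accountId; }
}

class AccountParticipantSketch {
    final Map<Long, Long> balances = new HashMap<>();
    private final Map<String, SketchJournal> withdrawJournal = new HashMap<>(); // keyed by LRA id
    private final Map<String, SketchJournal> depositJournal = new HashMap<>();  // keyed by LRA id

    /** Debits immediately, so the money cannot be used twice while the saga is in flight. */
    boolean withdraw(String lraId, long accountId, long amount) {
        SketchJournal journal = new SketchJournal(accountId);
        withdrawJournal.put(lraId, journal);
        Long balance = balances.get(accountId);
        if (balance == null || balance < amount) return false;
        // In the app, the balance update and journal entry share one local transaction.
        balances.put(accountId, balance - amount);
        journal.amount = amount;
        return true;
    }

    /** Only journals the intent; the credit is applied later, in completeDeposit. */
    boolean deposit(String lraId, long accountId, long amount) {
        SketchJournal journal = new SketchJournal(accountId);
        depositJournal.put(lraId, journal);
        if (!balances.containsKey(accountId)) return false;
        journal.amount = amount;
        return true;
    }

    /** The debit already happened, so complete only records the final state. */
    SketchState completeWithdraw(String lraId) {
        return withdrawJournal.get(lraId).state = SketchState.Completed;
    }

    /** Puts the journaled amount back exactly once, even if the coordinator retries. */
    SketchState compensateWithdraw(String lraId) {
        SketchJournal journal = withdrawJournal.get(lraId);
        if (journal.state == SketchState.Compensating || journal.state == SketchState.Compensated) {
            return journal.state;
        }
        journal.state = SketchState.Compensating;
        balances.computeIfPresent(journal.accountId, (id, balance) -> balance + journal.amount);
        return journal.state = SketchState.Compensated;
    }

    /** Applies the journaled credit exactly once. */
    SketchState completeDeposit(String lraId) {
        SketchJournal journal = depositJournal.get(lraId);
        if (journal.state == SketchState.Completing || journal.state == SketchState.Completed) {
            return journal.state;
        }
        journal.state = SketchState.Completing;
        balances.computeIfPresent(journal.accountId, (id, balance) -> balance + journal.amount);
        return journal.state = SketchState.Completed;
    }

    /** Nothing was credited, so compensate only records the final state. */
    SketchState compensateDeposit(String lraId) {
        return depositJournal.get(lraId).state = SketchState.Compensated;
    }
}
```

The asymmetry is the point of the sketch: between withdraw and complete the $100 is visible in neither account, which is safe, whereas crediting on deposit would briefly make it visible in both. A failed withdraw journals an amount of 0, so the compensate call that follows a cancel leaves the balance untouched.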
Now let's look at the Transfer service.
- TransferService
- transfer:
- This is the method that initiates the LRA/saga due to its
@LRA(value = LRA.Type.REQUIRES_NEW, end = false)
annotation.
- It then calls the withdraw Account service first and, if the withdraw is successful, calls the deposit Account service (as discussed in the LRA/saga flow earlier, the LRA/saga id is propagated and the Account services are enlisted/joined in the LRA initiated by the Transfer service). There is no ordering requirement as far as participant calls in this bank transfer scenario (which is the ideal design in general); however, the ordering of participant calls can have a bearing on business transactions (i.e., sagas) for a number of reasons, including optimizations. The LRA spec does not provide a mechanism for indicating ordering (in particular, complete and compensate calls are not guaranteed to be called in a particular order), though participants can be coded in such a way as to ensure this.
- Depending on the return from the withdraw and deposit calls, a close or cancel call is made implicitly to the coordinator (it is possible to make this call explicitly; however, the API to do so is not spec standard, so I use a portable, implicit, annotation-based approach in the app). The behavior here is a bit unique when compared to, e.g., JTA transactions and their demarcation for ending a transaction, despite both having the same transaction types/attributes (REQUIRES_NEW, MANDATORY, etc.) for annotations. This is due to a few reasons...
- While JTA transactions trigger rollback implicitly as a result of an exception being thrown from the method that began the transaction, LRAs involve communication over the network (more specifically, REST), so a cancel can only be triggered by HTTP response codes, which do not map to application/transaction level logic. The HTTP response codes that should trigger an LRA to cancel/compensate can be indicated via the cancelOn and cancelOnFamily elements of the @LRA annotation.
- As such, the determination to close or cancel an LRA generally falls on the caller, and the @LRA annotation provides an end attribute to facilitate this. This end boolean indicates whether the LRA should be terminated (i.e., whether the coordinator close or cancel call should be made) upon exit of the annotated method. The default is true, which is why you see it set to false in all but the methods meant to close or cancel the LRA. (This is analogous to the Terminator object in other transaction protocols, which allows ending a transaction from an arbitrary location.)
- I implemented the Transfer service so that it can either be called from an external client in order to end the LRA or end the LRA itself (rather than have yet another service that acts as a client calling the Transfer service). I did this by providing a close method (which in turn calls processClose) and a cancel method (which in turn calls processCancel) in the Transfer service, which the transfer method calls to end the LRA as appropriate (close for a successful transfer, cancel for an unsuccessful one).
- close/processClose:
- close is annotated with @LRA and a type of NOT_SUPPORTED, thus the LRA is suspended but the LRA id (which, as mentioned before, is also the transferId) can be passed to the processClose method.
- processClose is annotated with
@LRA(value = LRA.Type.MANDATORY)
As mentioned, the default value for end is true, and thus a call to the coordinator to close the LRA will be made when the method exits.
- cancel/processCancel:
- cancel is annotated with @LRA and a type of NOT_SUPPORTED, thus the LRA is suspended but the LRA id (which, as mentioned before, is also the transferId) can be passed to the processCancel method.
- processCancel is annotated with
@LRA(value = LRA.Type.MANDATORY, cancelOn = Response.Status.OK)
As mentioned, the default value for end is true, and cancelOn indicates the HTTP response code that will trigger a call to the coordinator to cancel the LRA, and thus the LRA will be cancelled/compensated when the method exits.
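The way the Transfer service ends the LRA can be modelled in a few lines. This is a simplified sketch, not the LRA client's implementation: LraEndSketch and its methods are hypothetical, HTTP status codes are plain integers, and status families are represented by their first digit. It only models methods that end the LRA on exit (end = true), and it assumes the 4xx and 5xx families are the ones that cancel when nothing else is configured.

```java
import java.util.Set;

class LraEndSketch {
    enum Outcome { CLOSE, CANCEL }

    /**
     * Decides what a method that ends the LRA (end = true) asks of the coordinator on exit.
     * cancelOn holds exact status codes; cancelOnFamily holds leading digits (4 for 4xx, ...).
     */
    static Outcome onExit(int httpStatus, Set<Integer> cancelOn, Set<Integer> cancelOnFamily) {
        boolean cancel = cancelOn.contains(httpStatus)
                || cancelOnFamily.contains(httpStatus / 100);
        return cancel ? Outcome.CANCEL : Outcome.CLOSE;
    }

    /** The transfer method picks close for a successful transfer and cancel otherwise. */
    static String chooseEndMethod(boolean withdrawOk, boolean depositOk) {
        return (withdrawOk && depositOk) ? "close" : "cancel";
    }

    /** Follows the chosen path through processClose or processCancel, both returning 200. */
    static Outcome transfer(boolean withdrawOk, boolean depositOk) {
        Set<Integer> errorFamilies = Set.of(4, 5); // assumed defaults, see lead-in
        if (chooseEndMethod(withdrawOk, depositOk).equals("close")) {
            // processClose: no cancelOn, so a 200 response closes the LRA
            return onExit(200, Set.of(), errorFamilies);
        }
        // processCancel: cancelOn = OK, so the same 200 response cancels the LRA
        return onExit(200, Set.of(200), errorFamilies);
    }
}
```

The sketch makes the trick visible: both end methods return the same successful response, and it is only the cancelOn value on processCancel that turns that response into a cancel.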
Co-Existence of JAX-RS and Spring Rest Within Spring Boot
While the LRA specification touches on the behavior of Non-JAX-RS participants in LRA sagas, there is additional/explicit work involved, and so in this example, while I use Spring Boot due to its popularity, I still use JAX-RS due to its direct and implicit support in MicroProfile. This being the case I will just provide the additional few mods necessary to run JAX-RS (specifically via Jersey) within Spring Boot.
- First, notice the JerseyConfig class, which registers, in particular, the ServerLRAFilter and LRAParticipantRegistry needed for the implementation and also enables forwarding of unknown requests:
@Component
@ApplicationPath("/")
public class JerseyConfig extends ResourceConfig {
    public JerseyConfig() {
        // JAX-RS resources that act as LRA participants
        register(AccountsDepositService.class);
        register(AccountsWithdrawService.class);
        // Filter that processes the @LRA annotations
        register(io.narayana.lra.filter.ServerLRAFilter.class);
        register(new AbstractBinder() {
            @Override
            protected void configure() {
                bind(LRAParticipantRegistry.class)
                    .to(LRAParticipantRegistry.class);
            }
        });
        // Forward requests Jersey does not recognize (404) on to Spring
        property(ServletProperties.FILTER_FORWARD_ON_404, true);
    }
}
- Second, notice the application.yaml of account service and how spring.jersey.type must be set to "filter"
spring:
  jersey:
    type: filter
In this blog, I have shown some of the aspects involved in creating saga participant implementations. While it goes into more detail than other blogs, it is not a complete implementation, of course. I will expand upon different scenarios and features in future blogs, including:
- Embedding of state in application data. For example, in an order inventory scenario, the introduction of a "pending" status for an order (where only a "completed" or "failed" state may have existed previously) can be used in a similar way as the in-doubt status of an XA/2PC transaction.
- The use of the new Saga engine in the Oracle Database which provides support for messaging (i.e., event-driven sagas) and takes away the need to do a lot of the bookkeeping, de-duplication/idempotency, etc. logic discussed.
- The use of the new Escrow and Auto-compensating datatypes in the Oracle Database that handle the burden of the error-prone compensation logic discussed implicitly for the developer. A tremendous advantage!
I welcome any questions or feedback as far as further considerations, improvements, etc., and thank you for reading!