Journey to Idempotency and Temporal Decoupling
Idempotency in HTTP means that the same request can be performed multiple times with the same effect as if it had been executed just once. If you replace the current state of some resource with a new one, then no matter how many times you do so, the end state will be the same as if you had done it just once. To give a more concrete example: deleting a user is idempotent because no matter how many times you delete a given user by unique identifier, in the end this user will be deleted. On the other hand, creating a new user is not idempotent because requesting such an operation twice will create two users. In HTTP terms, here is what RFC 2616: 9.1.2 Idempotent Methods has to say:

    9.1.2 Idempotent Methods

    Methods can also have the property of "idempotence" in that [...] the side-effects of N > 0 identical requests is the same as for a single request. The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.

Temporal coupling is an undesirable property of a system where the correct behaviour is implicitly dependent on the time dimension. In plain English, it might mean, for example, that the system only works when all components are present at the same time. Blocking request-response communication (ReST, SOAP or any other form of RPC) requires both client and server to be available at the same time, which is an example of this effect.

Having a basic understanding of what these concepts mean, let's go through a simple case study - a massively multiplayer online role-playing game. Our artificial use case is as follows: a player sends a premium-rated SMS to purchase a virtual sword inside the game. Our HTTP gateway is called when the SMS is delivered and we need to inform InventoryService, deployed on a different machine.
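Before turning to the API, the idempotency definition given earlier can be made concrete with a small sketch. It is only an illustration: the in-memory `UserStore` class and its method names are hypothetical and stand in for a real service, with `create` playing the role of POST and `delete` the role of DELETE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory user store; names are illustrative only.
class UserStore {
    private final Map<Long, String> users = new HashMap<>();
    private long nextId = 1;

    // Non-idempotent: every call creates another user with a fresh ID.
    long create(String name) {
        long id = nextId++;
        users.put(id, name);
        return id;
    }

    // Idempotent: after one call or ten, the user with this ID is gone.
    void delete(long id) {
        users.remove(id);
    }

    int size() {
        return users.size();
    }
}

class IdempotencyDemo {
    public static void main(String[] args) {
        UserStore store = new UserStore();
        long first = store.create("alice");
        store.create("alice");              // repeated request: a second user appears
        System.out.println(store.size());   // prints 2

        store.delete(first);
        store.delete(first);                // repeated request: no further effect
        System.out.println(store.size());   // prints 1
    }
}
```

Repeating `create` changes the state each time, while repeating `delete` leaves the state exactly as it was after the first call, which is the property a retrying client needs.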
The current API involves ReST and looks as follows:

    @Slf4j
    @RestController
    class SmsController {

        private final RestOperations restOperations;

        @Autowired
        public SmsController(RestOperations restOperations) {
            this.restOperations = restOperations;
        }

        @RequestMapping(value = "/sms/{phoneNumber}", method = POST)
        public void handleSms(@PathVariable String phoneNumber) {
            Optional<Player> maybePlayer = phoneNumberToPlayer(phoneNumber);
            maybePlayer
                    .map(Player::getId)
                    .map(this::purchaseSword)
                    .orElseThrow(() -> new IllegalArgumentException("Unknown player for phone number " + phoneNumber));
        }

        private long purchaseSword(long playerId) {
            Sword sword = new Sword();
            HttpEntity<String> entity = new HttpEntity<>(sword.toJson(), jsonHeaders());
            restOperations.postForObject(
                    "http://inventory:8080/player/{playerId}/inventory",
                    entity, Object.class, playerId);
            return playerId;
        }

        private HttpHeaders jsonHeaders() {
            HttpHeaders headers = new HttpHeaders();
            headers.setContentType(MediaType.APPLICATION_JSON);
            return headers;
        }

        private Optional<Player> phoneNumberToPlayer(String phoneNumber) {
            //...
        }
    }

Which in turn generates a request similar to this:

    > POST /player/123123/inventory HTTP/1.1
    > Host: inventory:8080
    > Content-type: application/json
    >
    > {"type": "sword", "strength": 100, ...}

    < HTTP/1.1 201 Created
    < Content-Length: 75
    < Content-Type: application/json;charset=UTF-8
    < Location: http://inventory:8080/player/123123/inventory/1

This is fairly straightforward. SmsController simply forwards the appropriate data to the inventory:8080 service by POSTing the sword that was purchased. This service, immediately or after a while, returns a 201 Created HTTP response confirming the operation was successful. Additionally, a link to the resource is created and returned, so you can query it. One might say: ReST state of the art.
However, if you care at least a little about the money of your customers and understand what ACID is (something that Bitcoin exchanges still have to learn: see [1], [2], [3] and [4]) - this API is too fragile and prone to errors. Imagine all these types of errors:

1. your request never reached the inventory server
2. your request reached the server but it refused it
3. the server accepted the connection but failed to read the request
4. the server read the request but hung
5. the server processed the request but failed to send a response
6. the server sent a 200 OK response but it was lost and you never received it
7. the server's response was received but the client failed to process it
8. the server's response was sent but the client timed out earlier

In all these cases you simply get an exception on the client side and you have no idea what the server's state is. Technically you should retry failed requests, but since POST is not idempotent, you might end up rewarding the gamer with more than one sword (in cases 5-8). But without a retry you might lose the gamer's money without giving him his precious artifact. There must be a better way.

Turning POST to idempotent PUT

In some cases it's surprisingly simple to convert from POST to idempotent PUT by basically moving ID generation from the server to the client. With POST it was the server that generated the sword's ID and sent it back to the client in the Location header.
It turns out that eagerly generating a UUID on the client side, changing the semantics a bit and enforcing some constraints on the server side is enough:

    private long purchaseSword(long playerId) {
        Sword sword = new Sword();
        UUID uuid = sword.getUuid();
        HttpEntity<String> entity = new HttpEntity<>(sword.toJson(), jsonHeaders());
        asyncRetryExecutor
                .withMaxRetries(10)
                .withExponentialBackoff(100, 2.0)
                .doWithRetry(ctx ->
                        restOperations.put(
                                "http://inventory:8080/player/{playerId}/inventory/{uuid}",
                                entity, playerId, uuid));
        return playerId;
    }

The API looks as follows:

    > PUT /player/123123/inventory/45e74f80-b2fb-11e4-ab27-0800200c9a66 HTTP/1.1
    > Host: inventory:8080
    > Content-type: application/json;charset=UTF-8
    >
    > {"type": "sword", "strength": 100, ...}

    < HTTP/1.1 201 Created
    < Content-Length: 75
    < Content-Type: application/json;charset=UTF-8
    < Location: http://inventory:8080/player/123123/inventory/45e74f80-b2fb-11e4-ab27-0800200c9a66

Why is it such a big deal? Simply put (no pun intended), the client can now retry the PUT request as many times as it wants. When the server receives the PUT for the first time, it persists the sword in the database with the client-generated UUID (45e74f80-b2fb-11e4-ab27-0800200c9a66) as primary key. In case of a second PUT attempt we can either update or reject such a request. This wasn't possible with POST because every request was treated as a new sword purchase - now we can track whether such a PUT came before or not.
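The server-side bookkeeping described above can be sketched in isolation, without Spring or a database. This is a minimal illustration under stated assumptions: `IdempotentInventory` is a hypothetical in-memory stand-in for the inventory table, and it takes the "ignore the duplicate" option, answering 201 for the first PUT and 200 for a recognised retry.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the inventory table,
// keyed by the client-generated UUID.
class IdempotentInventory {
    private final Map<UUID, String> swords = new ConcurrentHashMap<>();

    // Returns 201 when the sword is stored for the first time and
    // 200 when the same UUID was seen before (a retried request).
    int put(UUID swordId, String swordJson) {
        String previous = swords.putIfAbsent(swordId, swordJson);
        return previous == null ? 201 : 200;
    }

    int count() {
        return swords.size();
    }
}

class IdempotentPutDemo {
    public static void main(String[] args) {
        IdempotentInventory inventory = new IdempotentInventory();
        UUID swordId = UUID.randomUUID();   // generated once, on the client
        String json = "{\"type\": \"sword\", \"strength\": 100}";

        System.out.println(inventory.put(swordId, json)); // prints 201
        System.out.println(inventory.put(swordId, json)); // prints 200
        System.out.println(inventory.count());            // prints 1
    }
}
```

Because `putIfAbsent` is atomic on a `ConcurrentHashMap`, two simultaneous retries with the same UUID still leave exactly one sword; a database primary key or unique constraint would play the same role in a persistent store.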
We just have to remember that a subsequent PUT is not a bug, it's an update request:

    @RestController
    @Slf4j
    public class InventoryController {

        private final PlayerRepository playerRepository;

        @Autowired
        public InventoryController(PlayerRepository playerRepository) {
            this.playerRepository = playerRepository;
        }

        @RequestMapping(value = "/player/{playerId}/inventory/{invId}", method = PUT)
        @Transactional
        public void addSword(@PathVariable UUID playerId, @PathVariable UUID invId) {
            playerRepository.findOne(playerId).addSwordWithId(invId);
        }

    }

    interface PlayerRepository extends JpaRepository<Player, UUID> {}

    @lombok.Data
    @lombok.AllArgsConstructor
    @lombok.NoArgsConstructor
    @Entity
    class Sword {

        @Id
        @Convert(converter = UuidConverter.class)
        UUID id;
        int strength;

        // Two swords are equal when their IDs match, so a Set rejects duplicates.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Sword)) return false;
            Sword sword = (Sword) o;
            return id.equals(sword.id);
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    @Data
    @Entity
    class Player {

        @Id
        @Convert(converter = UuidConverter.class)
        UUID id = UUID.randomUUID();

        @OneToMany(cascade = ALL, fetch = EAGER)
        @JoinColumn(name="player_id")
        Set<Sword> swords = new HashSet<>();

        public Player addSwordWithId(UUID id) {
            swords.add(new Sword(id, 100));
            return this;
        }

    }

A few shortcuts were made in the code snippet above, like injecting the repository directly into the controller, as well as annotating it with @Transactional. But you get the idea. Also notice that this code is quite optimistic, assuming two swords with the same UUID aren't inserted at exactly the same time. Otherwise a constraint violation exception will occur.

Side note 1: I use UUID type in both controller and JPA models.
UUIDs aren't supported out of the box; for JPA you need a custom converter:

    public class UuidConverter implements AttributeConverter<UUID, String> {
        @Override
        public String convertToDatabaseColumn(UUID attribute) {
            return attribute.toString();
        }

        @Override
        public UUID convertToEntityAttribute(String dbData) {
            return UUID.fromString(dbData);
        }
    }

Similarly for Spring MVC (one-way only):

    @Bean
    GenericConverter uuidConverter() {
        return new GenericConverter() {
            @Override
            public Set<ConvertiblePair> getConvertibleTypes() {
                return Collections.singleton(new ConvertiblePair(String.class, UUID.class));
            }

            @Override
            public Object convert(Object source, TypeDescriptor sourceType, TypeDescriptor targetType) {
                return UUID.fromString(source.toString());
            }
        };
    }

Side note 2: if you can't change the client, you can track duplicates by storing each request's hash on the server side. This way when the same request is sent multiple times (retried by the client), it will be ignored. However, sometimes we might have a legitimate use case for sending the exact same request twice (e.g. purchasing two swords within a short period of time).

Temporal coupling - client unavailability

You think you're smart, but PUT with retries is not enough. First of all, a client can die while re-attempting failed requests. If the server is severely damaged or down, retrying might take minutes or even hours. You can't simply block your incoming HTTP request just because one of your downstream dependencies is down - you must handle such requests asynchronously in the background, if possible. But extending the retry time increases the probability of the client dying or being restarted, which would lose our request. Imagine we received a premium SMS but InventoryService is down at the moment. We can retry after a second, two, four, etc., but what if InventoryService was down for a couple of hours and it so happened that our service was restarted as well? We just lost that SMS and the sword was never given to the gamer.
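The retry schedule mentioned above (one second, two, four and so on) can be sketched as a capped exponential backoff. This is a minimal illustration, not the retry library used earlier in the article: the `Backoff` class, its `delayFor` method and the ten-second cap are assumptions made for the example.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper computing retry delays; not part of any library.
class Backoff {

    // Delay before the given retry attempt (0-based):
    // initial * 2^attempt, never exceeding max.
    static Duration delayFor(int attempt, Duration initial, Duration max) {
        long millis = initial.toMillis();
        // Stop doubling once the cap is reached, which also avoids overflow.
        for (int i = 0; i < attempt && millis < max.toMillis(); i++) {
            millis *= 2;
        }
        return Duration.ofMillis(Math.min(millis, max.toMillis()));
    }

    public static void main(String[] args) {
        List<Duration> schedule = new ArrayList<>();
        for (int attempt = 0; attempt < 6; attempt++) {
            schedule.add(delayFor(attempt, Duration.ofSeconds(1), Duration.ofSeconds(10)));
        }
        // Delays of 1, 2, 4 and 8 seconds, then 10 seconds for every later attempt.
        System.out.println(schedule);
    }
}
```

Note that such a schedule lives only in the memory of the retrying process, which is exactly the weakness described here: if the process restarts between attempts, both the schedule and the pending request are gone.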
An answer to such an issue is to persist the pending request first and handle it later in the background. Upon receiving an SMS we merely store the player ID in a database table called pending_purchases. A background scheduler or an event wakes up an asynchronous thread that will collect all pending purchases and try to send them to InventoryService (maybe even in a batch?). Periodic batch threads running every minute or even every second and collecting all pending requests will unavoidably introduce latency and unneeded database traffic. Thus I'm going for a Quartz scheduler instead, which will schedule a retry job for each pending request:

    @Slf4j
    @RestController
    class SmsController {

        private Scheduler scheduler;

        @Autowired
        public SmsController(Scheduler scheduler) {
            this.scheduler = scheduler;
        }

        @RequestMapping(value = "/sms/{phoneNumber}", method = POST)
        public void handleSms(@PathVariable String phoneNumber) {
            phoneNumberToPlayer(phoneNumber)
                    .map(Player::getId)
                    .map(this::purchaseSword)
                    .orElseThrow(() -> new IllegalArgumentException("Unknown player for phone number " + phoneNumber));
        }

        private UUID purchaseSword(UUID playerId) {
            UUID swordId = UUID.randomUUID();
            // Zero delay: the first attempt runs as soon as Quartz picks the job up.
            InventoryAddJob.scheduleOn(scheduler, Duration.ZERO, playerId, swordId);
            return swordId;
        }

        //...

    }

And the job itself:

    @Slf4j
    public class InventoryAddJob implements Job {

        @Autowired
        private RestOperations restOperations;

        @lombok.Setter
        private UUID invId;

        @lombok.Setter
        private UUID playerId;

        @Override
        public void execute(JobExecutionContext context) throws JobExecutionException {
            try {
                tryPurchase();
            } catch (Exception e) {
                // On failure, schedule another attempt of the same purchase.
                Duration delay = Duration.ofSeconds(5);
                log.error("Can't add to inventory, will retry in {}", delay, e);
                scheduleOn(context.getScheduler(), delay, playerId, invId);
            }
        }

        private void tryPurchase() {
            restOperations.put(/*...*/);
        }

        public static void scheduleOn(Scheduler scheduler, Duration delay, UUID playerId, UUID invId) {
            try {
                JobDetail job = newJob()
                        .ofType(InventoryAddJob.class)
                        .usingJobData("playerId", playerId.toString())
                        .usingJobData("invId", invId.toString())
                        .build();
                Date runTimestamp = Date.from(Instant.now().plus(delay));
                Trigger trigger = newTrigger().startAt(runTimestamp).build();
                scheduler.scheduleJob(job, trigger);
            } catch (SchedulerException e) {
                throw new RuntimeException(e);
            }
        }

    }

Every time we receive a premium SMS we schedule an asynchronous job to be executed immediately. Quartz will take care of persistence (if the application goes down, the job will be executed as soon as possible after restart). Moreover, if this particular instance goes down, another one can pick up this job - or we can form a cluster and load-balance requests between them: one instance receives the SMS, another one requests the sword in InventoryService. Obviously, if the HTTP call fails, a retry is re-scheduled for later; everything is transactional and fail-safe. In real code you would probably add a max retry limit as well as an exponential delay, but you get the idea.

Temporal coupling - client and server can't meet

Our struggle to implement retries correctly is a sign of an obscure temporal coupling between client and server - they must live together at the same time. Technically this isn't necessary.
Imagine a gamer sending an e-mail with an order to customer service, which they handle within 48 hours, changing his inventory manually. The same can be applied to our case, but replacing the e-mail server with some sort of message broker, e.g. JMS:

    @Bean
    ActiveMQConnectionFactory activeMQConnectionFactory() {
        return new ActiveMQConnectionFactory("tcp://localhost:61616");
    }

    @Bean
    JmsTemplate jmsTemplate(ConnectionFactory connectionFactory) {
        return new JmsTemplate(connectionFactory);
    }

Having the ActiveMQ connection set up, we can simply send the purchase request to the broker:

    private UUID purchaseSword(UUID playerId) {
        final Sword sword = new Sword(playerId);
        jmsTemplate.send("purchases", session -> {
            TextMessage textMessage = session.createTextMessage();
            textMessage.setText(sword.toJson());
            return textMessage;
        });
        return sword.getUuid();
    }

By entirely replacing the synchronous request-response protocol with messaging over a JMS topic we temporally decouple the client from the server. They no longer need to live at the same time. Moreover, more than one producer and consumer can interact with each other. E.g. you can have multiple purchase channels and, more importantly, multiple interested parties, not only InventoryService. Even better, if you use a specialized messaging system like Kafka you can technically keep days (months?) worth of messages without losing performance. The benefit is that if you add another consumer of purchase events to the system next to InventoryService, it will receive lots of historical data immediately. Moreover, your application is now temporally coupled with the broker, so a distributed and replicated system such as Kafka works better in that case.

Disadvantages of asynchronous messaging

Synchronous data exchange, as used in ReST, SOAP or any form of RPC is easy to understand and implement.
Who cares that this abstraction leaks insanely from a latency perspective (a local method call is typically orders of magnitude faster than a remote one, not to mention that the remote call can fail for numerous reasons unknown locally), it's quick to develop. One true caveat of messaging is the feedback channel. You can no longer just "send" ("return") a message back, as there is no response pipe. You either need a response queue with some correlation ID or temporary one-off response queues per request. Also, we lied a little bit claiming that putting a message broker between two systems fixes temporal coupling. It does, but now we are coupled to the messaging bus - which can just as well go down, especially since it's often under high load and sometimes not replicated properly.

This article shows some challenges and partial solutions to provide guarantees in distributed systems. But at the end of the day, remember that "exactly once" semantics are nearly impossible to implement easily, so double check you really need them.
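The response queue with a correlation ID mentioned above can be sketched without any broker. This is an illustration only: `Message`, `CorrelatingClient` and the in-memory queue are hypothetical stand-ins for broker messages and destinations, and the consumer is simulated inline.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical message envelope carrying a correlation ID next to the payload.
class Message {
    final String correlationId;
    final String body;

    Message(String correlationId, String body) {
        this.correlationId = correlationId;
        this.body = body;
    }
}

// Hypothetical client that matches asynchronous replies to earlier requests.
class CorrelatingClient {
    private final BlockingQueue<Message> requests;
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    CorrelatingClient(BlockingQueue<Message> requests) {
        this.requests = requests;
    }

    // Tags the request with a fresh correlation ID and remembers who waits for the reply.
    CompletableFuture<String> send(String body) {
        String correlationId = UUID.randomUUID().toString();
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(correlationId, reply);
        requests.add(new Message(correlationId, body));
        return reply;
    }

    // Called for every message arriving on the shared response queue.
    void onResponse(Message response) {
        CompletableFuture<String> reply = pending.remove(response.correlationId);
        if (reply != null) {
            reply.complete(response.body);
        }
    }
}

class CorrelationDemo {
    public static void main(String[] args) {
        BlockingQueue<Message> requests = new LinkedBlockingQueue<>();
        CorrelatingClient client = new CorrelatingClient(requests);

        CompletableFuture<String> first = client.send("sword");
        CompletableFuture<String> second = client.send("shield");

        // Simulated consumer: replies in reverse order, echoing each correlation ID.
        Message a = requests.poll();
        Message b = requests.poll();
        client.onResponse(new Message(b.correlationId, "added " + b.body));
        client.onResponse(new Message(a.correlationId, "added " + a.body));

        System.out.println(first.join());   // prints "added sword"
        System.out.println(second.join());  // prints "added shield"
    }
}
```

The replies arrive out of order, yet each future receives the answer to its own request, because the only link between request and response is the echoed ID. A real client would also need a timeout for entries in `pending` whose reply never arrives.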
March 13, 2015
Comments
Mar 24, 2020 · Lindsay Burk
The last part is finally here: https://www.nurkiewicz.com/2020/03/graphql-server-in-java-part-iii.html
Nov 01, 2019 · Lindsay Burk
Because getOrDefault() doesn't add item to the map.
Oct 30, 2019 · Lindsay Burk
Well, imagine your system is composed of a few independent services and you have many heterogeneous clients (web, mobile, batch). All clients evolve separately. By creating a single GraphQL endpoint (in a facade) you allow each client to request the data it needs in one go. Clients and server aren't coupled by the API evolution. An alternative is an exponentially growing number of endpoints/versions and the possibility of over-fetching or N+1 queries.
Oct 30, 2019 · Lindsay Burk
I appreciate the irony :-). Indeed I love SQL (especially because it's not JSON) but you don't want to expose your SQL database to the client. Especially when there are dozens of databases and a handful of other services that you combine into a single request.
Jan 02, 2019 · James Sugrue
Because the number of buckets is smaller than the number of possible hash codes so, by definition, two different hashes MAY end up in the same bucket.
Aug 21, 2017 · Bartłomiej Słota
Very good article with clear narrative and examples. Thanks for sharing!
May 06, 2014 · Robert Greathouse
The fact that you have to wrap all nullable references with Optional is actually a good thing. This forces you to handle nulls on the type system (i.e. compiler) level. In Groovy you have shorter syntax, but still it's possible to access null and get NPE. Also the code doesn't document itself.
Finally names map and flatMap aren't random, they have a long history and they come from monads, available in functional languages.
However I do agree some JVM languages deal with nulls better, e.g. Kotlin with question mark after type.
May 05, 2014 · James Sugrue
Pierre, everything you say is obviously right. It was a mental shortcut from my side, thanks for clarification. With regards to key/value pair distribution - it's even worse. Keys with different hash codes may still land in the same bucket, even if otherwise they could be easily distributed.
Also I agree with your statement about order of entries in a map. But unspecified order in HashMap is so strongly emphasized that I don't believe it will impact reasonably written software. On the other hand I saw unit tests breaking after migration from Java 6 to 7 because order of entries in HashMap changed - so it's not really the first time.
May 25, 2013 · Allen Coin
Thank you for your valuable comment!
Ad. 1: I took the concept and much of implementation details from Scala (Stream). Possibility to express infinite streams of data without actually evaluating them is a core concept. In Java 8 streams are mainly just wrappers around collections (with few exceptions, e.g. java.util.Random.ints() - infinite) so laziness would give nothing. But LazySeq focuses on lazy, expensive to compute, typically infinite sequences. Therefore I believe laziness is the core concept, not a detail or option. Of course it's also not a general-purpose List implementation, it is very special purpose.
Ad. 2: Memoization helps reusing already computed values. But I explain clearly how it can blow away your heap and thus should be used with care. This applies to any collection in any language - don't try to put too much into memory. On the other hand this data structure without memoization can be easily turned into a (recursive) function.
Ad. 3: head/tail concept was taken, again, from Stream in Scala. If you look at the source code, most operations like map() and filter() are insanely simple to implement on top of that. E.g. seq.map(f) turns into f(head) concatenated with (lazy) tail.map(f). head/tail also simplifies thinking about data structure (first element is eager, rest is lazy). But I agree that linked list is slow and wasteful. An alternative would be a small array (10-16 elements) in head and lazy rest. But this would be much more cumbersome to implement. Moreover I can't simply wrap existing collection as this could violate immutability. And did I mention that this was a toy-project to play with Java 8 and have fun? :-)
Thank you again for taking time to critique my tiny library.
Jan 24, 2013 · Allen Coin
@Jim O'callaghan - have a look at this discussion (bottom) - seems like you have the same issue.
Nov 06, 2012 · James Sugrue
Don't get me wrong, asynchronous servlets are great. It's only the AsyncContext.start() method that is not really that useful. But the concept of asynchronous servlets (including the ability to serve multiple requests using just one thread) is a great addition to the standard.
Nov 03, 2012 · James Sugrue
Thanks for in-depth explanation. I hope I described the warm-up functionality clearly by saying: it will gradually increase allowed frequency over configured time up to configured maximum value instead of allowing maximum frequency from the very beginning (?)
Nov 03, 2012 · James Sugrue
Indeed in some circumstances RateLimiter knows in advance that tryAcquire() cannot succeed, very smart!
Even better, use FEST assertions:
assertThat(result1).isEqualTo(42);
assertThat(result2).contains("foo");
Note that your example is a bit confusing. You are checking whether "result" variable is equal to 42 and contains "foo" string - at the same time.
Oct 10, 2012 · James Sugrue
Or even better, use FEST assertions:
assertThat(result1).isEqualTo(42);
assertThat(result2).contains("foo");
BTW because you are using the same variable "result" from your code sample it seems like this variable should be both 42 (int) and contain "foo" string...
Sep 01, 2010 · James Sugrue
This code definitely needs more unit tests than the traditional approach. Also there is a ready-made, more flexible and elegant solution, Dozer. But both approaches suffer from the same problem - while analyzing source code, string/reflection/other magic methods of copying data from one object to another aren't easy to spot. After doing some development with Dozer I gave it up and use the traditional, property-by-property approach. Simplicity is king! P.S.: Although, I find this idea really cool: BEANUTILS-375.