Exploring the Architecture of Amazon SQS

In this article, learn how Amazon SQS works, how it can be used to build distributed applications, and what its components and architecture look like.

Satrajit Basu

Feb. 25, 23 · Analysis

What Is Amazon SQS?

Amazon SQS (Simple Queue Service) is a message queue service that enables application components to communicate with each other by exchanging messages. It is widely used to build event-driven systems and to decouple services on AWS.

Features of Amazon SQS

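Message persistence: Messages are stored in the queue until a consumer deletes them or the message retention period expires (4 days by default, configurable up to 14 days).
Message delivery: Standard queues deliver each message at least once and preserve the sending order only on a best-effort basis. FIFO queues preserve the order strictly.
Message redelivery: If a consumer does not delete a message before its visibility timeout expires, the message becomes visible again and can be received again. A dead-letter queue can be configured to take messages that have been received more than a set number of times.
Visibility timeout: After a message is received, it is hidden from other consumers for a configurable period (30 seconds by default), so that it is not processed a second time while the first consumer is still working on it.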
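
Retention, redelivery, and the visibility timeout are all controlled through queue attributes. As a rough sketch, the helper below builds the attribute dictionary that could be passed to create_queue in boto3. The helper name, the queue name, and the default values are my own assumptions for illustration; the attribute names VisibilityTimeout, MessageRetentionPeriod, and RedrivePolicy are the ones SQS uses.

```python
import json


def build_queue_attributes(visibility_timeout_s=30,
                           retention_s=4 * 24 * 3600,
                           dead_letter_queue_arn=None,
                           max_receive_count=5):
    """Build a queue Attributes dict (hypothetical helper).

    SQS expects every attribute value to be a string.
    """
    attributes = {
        "VisibilityTimeout": str(visibility_timeout_s),
        "MessageRetentionPeriod": str(retention_s),
    }
    if dead_letter_queue_arn is not None:
        # After max_receive_count receives without a delete, SQS moves
        # the message to the dead-letter queue.
        attributes["RedrivePolicy"] = json.dumps({
            "deadLetterTargetArn": dead_letter_queue_arn,
            "maxReceiveCount": str(max_receive_count),
        })
    return attributes


# Usage, assuming `sqs = boto3.client("sqs")` and a queue named "orders":
# sqs.create_queue(QueueName="orders", Attributes=build_queue_attributes())
```

Keeping the attributes in one place makes it easier to see how the retention period, the visibility timeout, and the dead-letter policy relate to each other.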

Benefits of Amazon SQS

Amazon SQS is a reliable and scalable message queuing service for connecting application components. It provides the following benefits:

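Low cost of ownership: Amazon SQS is cost-effective because of its pay-per-use pricing model, because there is no messaging infrastructure to manage, and because it integrates with other AWS services such as Lambda and SNS.
High throughput: Standard queues support a nearly unlimited number of API calls per second, so throughput scales with the workload.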
Fault tolerance: The service is designed to be highly available and durable with no single point of failure.
Security: Amazon SQS encrypts messages in transit over HTTPS (TLS), can encrypt them at rest with server-side encryption, and controls access through AWS IAM and queue policies, with API requests signed by the client.

How To Process Messages in a FIFO Order With SQS

Wait, aren’t queues supposed to be First-in-First-out by design? Well, yes, but with SQS, it gets a little complicated. AWS makes a best effort to preserve the order of messages in a standard queue, but, now and then, messages can be delivered out of order.

SQS has a distributed architecture with lots of redundancy. This means there is more than one message store. At runtime, messages are picked up randomly from one of the stores.

Let me try to explain this better with an analogy. Suppose a group of three people goes to buy tickets at a railway station ticket counter. They decide to stand in three different queues, and whoever gets the tickets first will notify the others so they can leave their queues: distributed and redundant. Let us assume that all three queues have the same number of people at that moment. As luck would have it, another group of three people comes into the station at almost the same time. While the first group is splitting up into the three queues, a person from the second group manages to move past them and ends up ahead in one of the queues. When their turn comes, the person from the second group gets the tickets before the first group does. This is not desired, but it can happen at times.

So, what is the way out? If a strict FIFO order is required, AWS recommends using a FIFO queue. A FIFO queue, unlike a standard queue, guarantees strict ordering of the messages within a message group.

As an analogy, suppose you go into a bank and are immediately handed over a token. Token holders are served sequentially, ruling out the possibility of someone getting served out of turn.
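
In code, the "token" roughly corresponds to the message group: messages sent to a FIFO queue with the same MessageGroupId are delivered in the order they were sent. Here is a minimal sketch, assuming sqs is a boto3 SQS client (for example boto3.client("sqs")), queue_url points to a FIFO queue, and content-based deduplication is enabled on that queue so that no explicit deduplication id is needed. The function name send_in_order is hypothetical.

```python
def send_in_order(sqs, queue_url, group_id, bodies):
    """Send bodies one by one to a FIFO queue under a single message group.

    Assumption: the queue has content-based deduplication enabled;
    otherwise send_message also needs a MessageDeduplicationId.
    """
    message_ids = []
    for body in bodies:
        response = sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=body,
            MessageGroupId=group_id,  # ordering is kept within this group
        )
        message_ids.append(response["MessageId"])
    return message_ids
```

Messages in different groups can still be processed in parallel, so picking the group id (per customer, per account, and so on) is the main design decision here.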

How Are Messages Processed?

Standard SQS queues guarantee “at least once” delivery. This means messages won’t be lost and will be processed at least once. But what is “at least once” supposed to mean? Does it mean that messages can be processed more than once? Well, the answer is yes.

Let us first look at a message lifecycle. The following are the stages:

A message is put onto a queue.
It gets picked up by the consumer.
Once processed, the consumer deletes it from the queue.

Note: After processing, the message is not automatically deleted. The consumer has to delete it explicitly.
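
The consumer side of this lifecycle (receive, process, delete) can be sketched as follows. I am assuming sqs is a boto3 SQS client and handler is any function that takes the message body; consume_once is a hypothetical name, while receive_message and delete_message are the actual client calls.

```python
def consume_once(sqs, queue_url, handler, wait_seconds=10):
    """Receive at most one message, process it, then delete it explicitly.

    Returns True if a message was processed and deleted, False if the
    queue returned nothing within the wait time.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=wait_seconds,  # long polling
    )
    messages = response.get("Messages", [])
    if not messages:
        return False
    message = messages[0]
    # If the handler raises, the delete below is skipped and the message
    # becomes visible again once its visibility timeout expires.
    handler(message["Body"])
    sqs.delete_message(
        QueueUrl=queue_url,
        ReceiptHandle=message["ReceiptHandle"],
    )
    return True
```

Deleting only after the handler returns is what gives the "at least once" behavior: a crash in the middle leads to a retry, not to a lost message.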

Between stages #2 and #3, the message is “in-flight.” While a message is in flight, the visibility timeout hides it from other receive requests so it is not processed again. The visibility timeout is configurable, with a default of 30 seconds and a maximum of 12 hours. The idea is that a message has to be processed and subsequently deleted from the queue before the visibility timeout expires to avoid duplicate processing.
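
To see the visibility timeout in action without touching AWS, here is a toy in-memory model. It is not the SQS API and ignores everything else SQS does; the class name VisibilityModel and the explicit now argument are assumptions made so the behavior is easy to follow.

```python
class VisibilityModel:
    """Toy model of a single queue's visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._visible_at = {}  # message body -> time it becomes visible

    def send(self, body, now):
        self._visible_at[body] = now

    def receive(self, now):
        """Return one visible message and hide it, or None if all are in flight."""
        for body, visible_at in self._visible_at.items():
            if visible_at <= now:
                self._visible_at[body] = now + self.visibility_timeout
                return body
        return None

    def delete(self, body):
        self._visible_at.pop(body, None)


queue = VisibilityModel(visibility_timeout=30)
queue.send("order-1", now=0)
first = queue.receive(now=0)    # "order-1" is handed out and hidden
hidden = queue.receive(now=10)  # None: the message is still in flight
again = queue.receive(now=31)   # "order-1" again: it was never deleted
```

The last call is the duplicate-processing case: the first consumer took longer than the timeout, so the message was handed out a second time.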

However, there can be times when a message gets stuck while processing, resulting in the visibility timeout expiring and the message getting picked up again by a consumer. It can also happen that, during the delete, one of the servers holding a copy of the message is unavailable, so the copy on that server is not deleted. When the server comes back up, the message can be delivered and processed again. So, it is absolutely necessary to design applications to be idempotent when using a standard SQS queue. That is, even if a message is processed more than once, it shouldn’t have any business impact.
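
One common way to get idempotency is to remember which messages have already been handled and skip repeats. The sketch below keeps that record in memory purely for illustration; in a real system it would have to live in a durable store shared by all consumers. The class name and the choice of key are assumptions: the key could be the SQS MessageId or, often better, a business identifier carried in the message body.

```python
class IdempotentHandler:
    """Apply a side effect at most once per message id (illustrative only)."""

    def __init__(self, apply_effect):
        self._apply_effect = apply_effect
        # Assumption: an in-memory set stands in for a durable store.
        self._processed_ids = set()

    def handle(self, message_id, body):
        """Return True if the effect was applied, False if it was a repeat."""
        if message_id in self._processed_ids:
            return False
        self._apply_effect(body)
        self._processed_ids.add(message_id)
        return True
```

For example, if the same message id arrives twice, the second call to handle returns False and the side effect is not applied again.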

Coming back to our railway station example, let us assume that once a person in a group gets the tickets, they will text all the others in the group. But if the phone of one of the receivers is off the network at that moment, that person will not receive the text and will purchase the tickets again. This is a common scenario where a message can be processed more than once.

Messages in a FIFO queue, on the other hand, get processed exactly once. This leads us to another important topic: how are duplicate messages handled? A standard queue doesn’t care if you put in duplicate messages, since the downstream application is supposed to be idempotent.

The FIFO queue, on the other hand, does not allow duplicates within its deduplication interval of 5 minutes. Every message carries a deduplication id; with content-based deduplication enabled, SQS generates it as a SHA-256 hash of the message body. However, if messages with identical bodies have to be processed within that window, the content-based id will not work, because the later messages would be dropped as duplicates. In that case, the producer has to supply a custom unique deduplication id for each message, so that every incoming message is processed even if its body is exactly the same as that of a previous message.
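
The difference between the two kinds of deduplication id can be sketched like this. The first function mimics a content-based id as a SHA-256 hash of the body; it only illustrates the idea and is not guaranteed to match the value SQS computes internally. The second produces a random id per message. Both function names are hypothetical.

```python
import hashlib
import uuid


def content_based_dedup_id(body: str) -> str:
    """Same body -> same id, so a repeat within the window is dropped."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def unique_dedup_id() -> str:
    """Random id per message, so identical bodies are all accepted."""
    return str(uuid.uuid4())


# Usage, assuming `sqs = boto3.client("sqs")` and a FIFO queue URL:
# sqs.send_message(
#     QueueUrl=queue_url,
#     MessageBody=body,
#     MessageGroupId="payments",
#     MessageDeduplicationId=unique_dedup_id(),
# )
```

Note that a random id turns deduplication off in practice: a producer that retries the same send gets a new id each time, so the idempotency burden moves back to the consumer.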

Which Type of SQS Should You Use?

As a rule of thumb, you should prefer a standard queue. It is distributed, redundant, and comes with nearly unlimited throughput. After all, it is designed to scale and serve all types of workloads with considerable ease. However, if strict ordering is of the utmost importance for the application you are building and you can live with lower throughput limits, then a FIFO queue will be your best choice.

Opinions expressed by DZone contributors are their own.
