And why we are updating our plan structure to include Quorum Queues.
We just published our new plans! There is a huge difference in them compared to earlier plans, the main difference being our decision to switch away from classic mirrored queues. This is the story of the reasons we are switching away from classic mirrored queues in favor of using quorum queues.
Perhaps one of the most significant changes in RabbitMQ 3.8 was the new queue type called Quorum Queues. This is a replicated queue to provide high availability and data safety. Soon after introducing quorum queues, the RabbitMQ team wrote this on their website:
Obviously, we had to investigate, and here is what we found.
Mirrored Queues inefficient algorithm
The main problems revolve around the synchronization model and performance. Performance is slower than it should be because messages are replicated using a very inefficient algorithm. HA queue synchronization is a troublesome topic that RabbitMQ administrators often have to learn the hard way to work with.
Mirrored queues work with a single leader queue and one or more mirror queues. All reads and writes go through the leader queue, which replicates all the commands (write, read, ack, nack, etc.) to the mirrors. Once all the live mirrors have the message, the leader will send a confirmation to the publisher. At this point, if the leader failed, a mirror would get promoted to leader and the queue would remain available, with no data loss. This pointed out a couple of design flaws.
Design flaw #1 - When a broker goes offline and comes back again
The basic problem is that when a broker goes offline and comes back again, any data it had in mirrors gets discarded. This is critical design flaw #1. Now that the mirror is back online but empty, the administrators have a decision to make: to synchronize the mirror or not. "Synchronize" means to replicate the current messages from the leader to the mirror.
Design flaw #2 - Synchronization blocking
That's where critical design flaw #2 comes in. Synchronization blocks the whole queue, causing it to become unavailable. That's not normally a problem if the queues are short and the messages in them are semi small. But if the queues are very long (>1M msgs) or very large (>1GB) this can take a very long time. Messages get published and consumed at the same rate and messages remain in the queue for a very short time. But sometimes a queue can grow large, either by choice or because a downstream system is slow or offline. In the meantime, the system remains available but accumulating messages in its queues.
If you have no messages or a few thousand small messages, then the impact of synchronization is small. Synchronization will be quick and publishers can resend any messages that don't get accepted by the broker while it is unavailable. But when a queue is large, the impact is much greater. It can take minutes, hours, or in very extreme cases, even days to synchronize (though in most of the cases when we see this, the server crashes before it finishes, and then it needs to start over again). Not only that, but synchronization has been known to cause memory-related issues on the cluster, sometimes even causing synchronization to get stuck requiring a reboot.
All mirrored queues synchronize automatically, by default, but sometimes administrators simply choose not to synchronize a mirror. All new messages would get replicated but any existing messages would not, causing reduced redundancy and exposing the cluster to a greater chance of message loss.
These issues also made rolling upgrades problematic as a rebooted broker would discard all its data and require synchronization to recover full data redundancy.
Quorum Queues
Quorum queues aim to resolve both the performance and the synchronization failings of mirrored queues. Using a variant of the Raft protocol which has become the industry de facto distributed consensus algorithm, quorum queues are both safer and achieve higher throughput than mirrored queues. So, what does that mean?
Each quorum queue is a replicated queue; it has a leader and multiple followers. A quorum queue with a replication factor of five will consist of five replicated queues: the leader and four followers. Each replicated queue will be hosted on a different node (broker).
Clients (publishers and consumers) always interact with the leader, which then replicates all the commands (write, read, ack, etc.) to the followers. The followers do not interact with the clients at all; they exist only for redundancy, allowing availability when a RabbitMQ broker fails, is shut down or gets rebooted. When a broker goes offline, a follower replica on another broker will be elected leader and service will continue.
Quorum queues have their name because all operations (message replication and leader election) require a majority (known as a quorum) of the replicas to agree. When a publisher sends a message, the queue can only confirm it once a majority of replicas have written the message to disk. This means that a slow minority does not slow down the queue as a whole. Likewise, a leader can only be elected when a majority agrees to it, and this prevents two leaders from accepting messages when a network partition occurs. So quorum queues are oriented towards consistency over availability.
CloudAMQP plan Specification
1/3/5 nodes: Cold Standby setup
Monitoring tools automatically detect if a node goes down, and if that happens, a new instance (a cold standby system) is available within seconds and the data is mounted on the new instance. The data located on EBS drives are replicated 3 times for each EBS drive (i.e 15 times for 5 node cluster).
You get a single URL to the instance. Failover and load balancing between nodes is done over DNS, with a low TTL (30s). Users connecting over AWS PrivateLink. are load-balanced over elastic load balancing (ELB) instead of DNS load balancer.
Multi-AZ on AWS and GCE
Multi-Availability Zone (AZ) means that the nodes are located in different availability zones. An Availability Zone is an isolated location within a region and a region is a separated geographic area (like EU-West or US-East). Multi-AZ is a standard setting for all clusters (with at least three nodes) for CloudAMQP instances hosted in AWS and GCE.
Availability set on Azure
CloudAMQP Instances in Azure were placed in an availability sets until December 21, 2022. Azure makes sure that all nodes within an availability set run across multiple physical servers etc. Availability zones in Azure are supported by CloudAMQP for all clusters created since December 21, 2022.
Poison Message Handling
Quorum queues support the handling of poison messages, which are messages that cause consumers to requeue messages due to a failure of some kind. Poison messages are never consumed completely and therefore never positively acknowledged so that they can be marked for deletion by RabbitMQ. Quorum queues keep track of unsuccessful delivery attempts and include the information in the “x-delivery-count” header added to any redelivered message.
MQTT and quorum queues
MQTT is using raft for client tracking. The protocol is not working on 2 nodes (it requires a quorum). We therefore recommend users on a two node cluster using MQTT to upgrade to the new plans.
1 node: Best performance, cold standby for high availability, always consistent
One node plans are fastest and simple, all data that is written to disk is safe, but if you use transient messages or non-durable queues you might lose messages if there's a hardware failure and the node has to be restarted.
In order to avoid losing messages in a single node broker, you should be prepared for broker restarts, broker hardware failure, or broker crashes. To ensure that messages and broker definitions survive restarts, ensure that they are on the disk. Messages, exchanges, and queues that are not durable and persistent will be lost during a broker restart.
If you cannot afford to lose any messages, make sure that your queue is declared as “durable” and that messages are sent with delivery mode "persistent". The messages will be saved to disk and everything will be intact when the node comes back up again.
3 nodes cluster: Data replicated over 3 nodes in 2 AZs
A CloudAMQP cluster with three nodes gives you three RabbitMQ servers. These servers are placed in different zones (availability zones in AWS) in all data centers with support for zones.
Classic mirrored queues
The queues are automatically mirrored between nodes. Each mirrored queue consists of one leader and one or more followers. Messages published to the queue are replicated to all followers. RabbitMQ HA is used to handle this via a default policy.
Classic queues used in CloudAMQP, in a 3 node cluster, are be default mirrored into two nodes.
Quorum Queues
The default quorum queue cluster size is set to the same value as the number of nodes in the cluster. This argument can be overridden with queue the argument quorum_cluster_size.
Read more about configuration options here: https://www.rabbitmq.com/quorum-queues.html#configuration
5 nodes: Data replicated over 5 nodes in 3 AZs
A CloudAMQP cluster with five nodes gives you five RabbitMQ servers. The servers are divided into different availability zones in data centers with support for zones. Two nodes will be allocated in one zone, two in the next, and the final node in a third zone.
Classic mirrored queues
The queues are automatically mirrored between nodes. Each mirrored queue consists of one leader and one or more followers. Messages published to the queue are replicated to all followers. RabbitMQ HA is used to handle this via a default policy.
Classic queues in a 5 node cluster are mirrored into two nodes.
Quorum Queues
The default quorum queue cluster size is set to the same value, as the number of nodes in the cluster. This argument can be overridden with queue arguments: https://www.rabbitmq.com/quorum-queues.html#configuration
Frequently Asked Questions
Q: Are mirrored queues still supported?
A: Yes, mirrored queues will still behave in the same way as before, with the difference that CloudAMQP mirrored queues are only mirrored to two nodes by default. This can be overridden to 3, 4 or 5 nodes but is not advisable by CloudAMQP due to a lot of intra cluster traffic.
Q: Do I need to change anything in my code to use quorum queues?
A: Yes, the queue has to be declared as a quorum queue using a queue declare argument, but apart from that you publish and consume from it like normally. With publish confirm and manual acknowledgement of course.
Certain features are not available for Quorum Queues. Check if your current setup requires any of the following features, and if it does, reach out to us for help so that we can discuss your setup.
These features are not available with Quorum Queues:
- Non-durable messages
- Queue Exclusivity
- Some policies are not available. Only dead letter exchange and length limits are available.
- Priorities
Q: Why do you not set up a plan with 7 or more nodes?
A: Performance can be affected when quorum queue node sizes larger than 5 are in place. Also, AMQP doesn't support client steering, so it's hard to scale out a rabbitmq cluster horizontally.
Read more
- Quorum Queues - RabbitMQ.com
- RabbitMQ 3.8 Feature Focus - Quorum Queues
- Quorum Queues Internals - A Deep Dive
- [Video] Lifting the lid on Quorum Queues (25 min)