Cluster Architecture of Cedalo Pro Mosquitto Broker
Introduction
Pro Mosquitto's cluster feature lets you build different types of clusters of Pro Mosquitto nodes, each with a single leader and multiple followers. This architecture keeps the broker available even if one node fails or goes offline.
Cluster Modes
The Mosquitto cluster operates in one of two distinct modes. Both modes require a minimum of three brokers so that data remains synchronized if a broker fails.
1. High Availability Mode (Active-Passive)
In High Availability mode, the cluster functions as an active-passive setup. This means that only one node is active at any given time, while the remaining nodes act as failover nodes. Key features of this mode include:
- Active node synchronization: The active node continuously synchronizes MQTT session information and authentication data with the follower nodes.
- Seamless failover: If the active node fails, a follower node takes over as the leader, allowing clients to reconnect as if nothing changed.
- Client-to-client communication: This mode supports communication between clients connected to the broker.
Best suited for:
- Client-to-client communication: Ideal when your MQTT clients need to exchange messages directly with each other. This mode ensures all session states and messages are synchronized across the cluster.
- Complete MQTT session synchronization: Ensures that if the active node fails, clients can reconnect to the new leader with their session state fully preserved.
- High message delivery reliability: Provides seamless failover without data loss, ensuring reliable message delivery.
Recommended for:
- Use cases where reliable message delivery and session persistence are critical.
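The failover behavior described above can be sketched from the client's point of view. This is a minimal illustration only, assuming the client knows the addresses of all cluster nodes and tries them in turn until one accepts the connection. The node names and the `try_connect` callback are hypothetical placeholders and are not part of the Pro Mosquitto API. In a real deployment the connection attempt would be made by your MQTT client library, or the node selection could be handled by a load balancer in front of the cluster.

```python
from typing import Callable, Optional, Sequence


def connect_to_active_node(
    nodes: Sequence[str],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first node that accepts a connection, or None if none does.

    In active-passive mode only the current leader is expected to accept
    clients, so the followers simply reject the attempt and we move on.
    """
    for node in nodes:
        if try_connect(node):
            return node
    return None


# Hypothetical node addresses, for illustration only.
NODES = ["node1.example.com", "node2.example.com", "node3.example.com"]

# Simulate a cluster where node1 is the leader.
current_leader = "node1.example.com"
print(connect_to_active_node(NODES, lambda node: node == current_leader))

# Simulate a failover: node1 is gone and node3 has been elected leader.
current_leader = "node3.example.com"
print(connect_to_active_node(NODES, lambda node: node == current_leader))
```

Because the session state is synchronized to the followers, the client in this sketch only needs to find the new leader. It does not need to rebuild its subscriptions itself.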
2. Dynamic-Security Sync Mode (Active-Active)
Dynamic-Security Sync mode operates as an active-active cluster, where all nodes can accept client connections simultaneously. Key characteristics of this mode include:
- Dynamic security synchronization: Only role-based access control (dynamic security authentication) data is synchronized between nodes, not MQTT session data.
- No client-to-client communication: Clients can connect to any node and data flows only to a backend service.
Best suited for:
- High throughput applications: Distributes client load across multiple nodes, improving performance and handling a large number of simultaneous connections.
- Client-to-backend communication: Perfectly suited for data collection scenarios where devices send telemetry data to a backend (e.g., monitoring systems, IoT sensor data collection). Works well when MQTT clients do not need to communicate with each other but need to send data to a backend service or data processing pipeline.
- Simplified scalability: Can be scaled effortlessly, enabling the addition of nodes to accommodate increased load.
Considerations:
- Limited client interaction: Clients cannot directly communicate with each other, which limits collaborative use cases.
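Since every node accepts connections in this mode, client load has to be spread across the nodes somehow. The sketch below shows one possible client-side approach, assuming each client picks a node by hashing its client ID. This is an illustrative assumption and not a mechanism provided by Pro Mosquitto. The node names are hypothetical, and in practice a load balancer in front of the cluster may do this job instead.

```python
import zlib
from typing import Sequence


def pick_node(client_id: str, nodes: Sequence[str]) -> str:
    """Map a client ID to one of the nodes in a stable, repeatable way."""
    if not nodes:
        raise ValueError("at least one node is required")
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    index = zlib.crc32(client_id.encode("utf-8")) % len(nodes)
    return nodes[index]


# Hypothetical node addresses, for illustration only.
NODES = ["node1.example.com", "node2.example.com", "node3.example.com"]

for client_id in ("sensor-001", "sensor-002", "sensor-003"):
    print(client_id, "->", pick_node(client_id, NODES))
```

A given client always maps to the same node as long as the node list does not change. If that node becomes unavailable, the client can fall back to any other node, because no MQTT session state is kept per node in this mode.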
Choosing the Cluster Mode
When deciding on a cluster mode for your Mosquitto broker setup, consider the following factors:
- Client Communication Needs: If clients need to communicate with each other directly, opt for a mode that supports this feature.
- Session Synchronization Requirements: Choose a mode that ensures session state is fully synchronized if message reliability and seamless failover are critical.
- Throughput and Load Distribution: For high-throughput applications, consider a mode that allows multiple clients to connect simultaneously without bottlenecks.
- Backend Interaction: If clients primarily send data to a backend without needing to interact with each other, select a mode focused on optimizing backend communication.
- Scalability: Assess how easy it is to scale the cluster with the chosen mode, particularly if your system requires handling a large number of connections.
| Criteria | Full Sync Mode (Active-Passive) | Dynamic-Security Sync Mode (Active-Active) |
|---|---|---|
| Client-to-Client Communication | Supported (clients can interact with each other) | Not supported (clients interact with backend only) |
| Session State Synchronization | Fully synchronized | Not synchronized (only role-based access control data is synchronized) |
| Failover Handling | Seamless client reconnection with full state | Client reconnection without session state |
| Load Distribution | All clients connect to the leader node | Clients can connect to any node |
| Scalability | Limited by leader's capacity | High scalability, balanced load |

| Criteria/Setup | Single Node | HA - Full Sync (Active-Passive) | HA - Dynamic-Security Sync (Active-Active) |
|---|---|---|---|
| All Nodes Accept Client Connections | ✅ | ❌ Only leader | ✅ |
| MQTT Session Synchronization | Not needed | ✅ | ❌ |
| Authentication & Authorization Sync | Not needed | ✅ | ✅ |
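The decision criteria above can be condensed into a small helper. This is only a sketch of the reasoning in this section, assuming that client-to-client communication and session synchronization are the two deciding requirements. The names `ClusterMode` and `choose_cluster_mode` are made up for this example and do not correspond to any Pro Mosquitto setting.

```python
from enum import Enum


class ClusterMode(Enum):
    FULL_SYNC = "High Availability / Full Sync (active-passive)"
    DYNSEC_SYNC = "Dynamic-Security Sync (active-active)"


def choose_cluster_mode(
    needs_client_to_client: bool,
    needs_session_sync: bool,
) -> ClusterMode:
    """Pick a cluster mode from the two deciding requirements.

    Either requirement rules out active-active, because that mode
    synchronizes only role-based access control data between nodes.
    """
    if needs_client_to_client or needs_session_sync:
        return ClusterMode.FULL_SYNC
    return ClusterMode.DYNSEC_SYNC


# Telemetry collection: devices only publish to a backend.
print(choose_cluster_mode(False, False).value)

# Devices exchange messages with each other and rely on persistent sessions.
print(choose_cluster_mode(True, True).value)
```

Throughput and scalability needs only come into play once neither hard requirement applies, which is why they are not parameters here.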
Cluster Architecture
Overview
The Mosquitto cluster architecture comprises a minimum of three broker nodes to maintain high availability and data consistency. Each cluster setup includes:
- A leader node that manages the active connections.
- Two or more follower nodes that are available as failovers to take over as the leader if needed.
If a three-node cluster degrades to the point where only a single node is available, clients will not be able to connect until a majority (at least two nodes) is available again.
Cluster Operation with Raft Consensus Algorithm
The Mosquitto broker cluster uses the Raft consensus algorithm to maintain consistency and reliability across the nodes. Consensus is a fundamental problem in fault-tolerant distributed systems, where multiple servers must agree on shared values. Here’s a summary of how the cluster operates using the Raft algorithm:
- Consensus Agreement: Raft ensures that all servers in the cluster agree on values. Once a decision is made, it is final and cannot be undone.
- Fault Tolerance: Raft will only make progress if a majority of servers are available. For example, in a cluster of 5 nodes, the system can continue to operate even if up to 2 nodes fail or are not reachable. If more servers fail, the cluster stops making progress but will not produce incorrect results.
- Replicated State Machines: Each server in the cluster maintains a replicated state machine and a log. These mechanisms help ensure that even in the case of node failures, the cluster maintains data integrity and provides consistent operations.
- Majority: Majority is a key component of the cluster design. Only nodes that can see a working cluster containing a majority of all nodes will continue to accept connections. This is especially important during network outages, which can split the nodes into two separate groups. Only the group that holds the majority remains available, so only one "truth" is created.
This approach to using Raft consensus makes the Mosquitto cluster robust, ensuring that even in cases of node failures, the data remains consistent, and the overall system behavior is predictable.
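The majority rule can be made concrete with a little arithmetic. The sketch below uses the standard Raft majority definition (more than half of the configured nodes). The function names are chosen for this example only and are not part of the broker.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes may fail while a majority is still reachable."""
    return cluster_size - quorum_size(cluster_size)


def has_majority(reachable_nodes: int, cluster_size: int) -> bool:
    """True if a group of reachable nodes may keep accepting connections."""
    return reachable_nodes >= quorum_size(cluster_size)


for size in (3, 5):
    print(size, "nodes: quorum", quorum_size(size),
          "- tolerates", tolerated_failures(size), "failure(s)")

# A network split of a 5-node cluster into groups of 3 and 2 nodes:
print(has_majority(3, 5))  # True: this side stays available
print(has_majority(2, 5))  # False: this side stops accepting connections
```

For the minimum cluster of three nodes the quorum is two, which is why a single remaining node cannot serve clients on its own.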
Cluster Architecture Diagram
The diagrams below illustrate the suggested architecture for the two cluster modes. Each setup includes three broker nodes operating Mosquitto and a fourth node running the Mosquitto Management Center (MMC).
Figure 1: High Availability Mode
Figure 2: Dynamic-Security Sync Mode
Conclusion
The Cedalo Mosquitto high-availability cluster, built on the Raft protocol, offers robust failover capabilities and flexible clustering options. By choosing the appropriate cluster mode (High Availability for client-to-client communication, or Dynamic-Security Sync for clients that only send data to a backend), you can ensure that your MQTT broker setup is resilient, scalable, and tailored to your specific use cases.