Runbook

RabbitMQ Cluster Split Brain Incident

Overview

A RabbitMQ cluster split brain (network partition) occurs when one group of RabbitMQ nodes loses connectivity with another group. Each group keeps running on its own and assumes the other has failed, so queues, exchanges, and bindings can be created or deleted independently on each side. This can cause message duplication, message loss, and diverging cluster state. The incident needs immediate attention from system administrators to reconcile the partitions and restore normal operation.

Parameters

Debug

Check the RabbitMQ cluster status for reported network partitions
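
A minimal sketch of this check, assuming RabbitMQ 3.8 or later where `rabbitmqctl cluster_status --formatter json` is available. The `partitions` key and its shape (a map from a node to the peers it considers partitioned) are assumptions about the JSON output; the helper names are hypothetical.

```python
import json
import subprocess


def fetch_cluster_status(node=None):
    """Run `rabbitmqctl cluster_status` and parse its JSON output."""
    cmd = ["rabbitmqctl"]
    if node:
        cmd += ["-n", node]  # target a specific node, e.g. "rabbit@host1"
    cmd += ["cluster_status", "--formatter", "json"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def find_partitions(status):
    """Return only the nodes that report a partition with at least one peer."""
    # Assumed shape: {"rabbit@a": ["rabbit@b"], ...}; empty when healthy.
    partitions = status.get("partitions") or {}
    return {node: peers for node, peers in dict(partitions).items() if peers}
```

Run it on a node from each side of the suspected split, for example `find_partitions(fetch_cluster_status())`. A non-empty result means that node has detected a partition.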

Check the status of all RabbitMQ nodes in the cluster
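
One way to sketch a per-node check is to call `rabbitmq-diagnostics check_running` for every node and record whether it succeeded. The node names and the `node_report` helper are hypothetical. During a partition the CLI may be unable to reach nodes on the other side, so a failure here can mean either "node down" or "node unreachable from this host".

```python
import subprocess


def check_running(node):
    """Return True if `rabbitmq-diagnostics check_running` succeeds for the node."""
    result = subprocess.run(
        ["rabbitmq-diagnostics", "-n", node, "check_running"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def node_report(nodes, probe=check_running):
    """Map each node name to the boolean result of the probe."""
    return {node: probe(node) for node in nodes}
```

For example, `node_report(["rabbit@host1", "rabbit@host2"])` returns a dict of node name to `True` or `False`. The `probe` parameter exists only so the check can be swapped out or stubbed.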

Check the RabbitMQ configuration files for any misconfigurations, in particular the partition handling strategy (cluster_partition_handling)
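
As an illustration, the partition handling strategy can be read from the text of `rabbitmq.conf`. This sketch assumes the newer `key = value` config format; the function name is hypothetical, and the file location (often `/etc/rabbitmq/rabbitmq.conf` on Linux packages) is an assumption that varies by installation.

```python
def partition_handling(conf_text):
    """Return the configured cluster_partition_handling value.

    Falls back to "ignore", which is RabbitMQ's default when the key is unset.
    """
    value = "ignore"
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == "cluster_partition_handling":
            value = val.strip()
    return value
```

A result of `ignore` means the cluster takes no automatic action when a partition occurs, which is worth flagging during the review.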

Check the network connectivity between RabbitMQ nodes
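
A small sketch of a connectivity probe between nodes. It assumes the default clustering ports: 4369 for epmd and 25672 for inter-node and CLI traffic. Adjust `CLUSTER_PORTS` if the deployment overrides them; the helper names are hypothetical.

```python
import socket

# Default ports; deployments may override these.
CLUSTER_PORTS = {
    4369: "epmd (peer discovery)",
    25672: "inter-node and CLI communication",
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_peer(host):
    """Probe every clustering port on a peer host."""
    return {port: port_open(host, port) for port in CLUSTER_PORTS}
```

Run `check_peer("<peer-hostname>")` from each node toward every other node, since a partition can be asymmetric. A TCP-level success does not prove that Erlang distribution is healthy, only that the port is reachable.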

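Check the RabbitMQ queues for any inconsistencies or errors, such as queues that exist on only one side of the partition or message counts that differ between sides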
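
One possible way to spot queue inconsistencies is to list queues from a node on each side of the partition and compare the results. The sketch below assumes `rabbitmqctl list_queues name messages --formatter json` returns a JSON array of objects keyed by column name; the helper names are hypothetical, and only a single virtual host is compared.

```python
import json
import subprocess


def fetch_queues(node, vhost="/"):
    """List queue names and message counts as seen by one node."""
    cmd = ["rabbitmqctl", "-n", node, "list_queues", "-p", vhost,
           "name", "messages", "--formatter", "json"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def diff_queues(side_a, side_b):
    """Compare two queue listings and report where they disagree."""
    a = {row["name"]: row.get("messages") for row in side_a}
    b = {row["name"]: row.get("messages") for row in side_b}
    return {
        "only_on_a": sorted(a.keys() - b.keys()),
        "only_on_b": sorted(b.keys() - a.keys()),
        "message_count_differs": sorted(
            name for name in a.keys() & b.keys() if a[name] != b[name]
        ),
    }
```

Message counts change constantly on a live system, so a differing count is a hint to investigate rather than proof of divergence.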

Check the RabbitMQ exchange bindings for any inconsistencies or errors
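
A binding that points at a queue which no longer exists on this side of the partition is one kind of inconsistency worth looking for. The sketch below works on rows shaped like the output of `rabbitmqctl list_bindings --formatter json`, assuming the default columns `source_name`, `destination_name`, `destination_kind`, and `routing_key`. The function name is hypothetical.

```python
def dangling_bindings(bindings, queue_names):
    """Return bindings whose destination queue is not in queue_names."""
    known = set(queue_names)
    return [
        b for b in bindings
        if b.get("destination_kind") == "queue"
        and b.get("destination_name") not in known
    ]
```

Feed it the binding rows and the queue names gathered from the same node and virtual host. Exchange-to-exchange bindings are ignored in this sketch.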

Check the RabbitMQ virtual hosts for any inconsistencies or errors
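
Virtual hosts created or deleted during the split may exist on only some nodes. As a rough sketch, collect the virtual host names each node reports (for example from `rabbitmqctl -n <node> list_vhosts`) and compare them. The input shape and the function name are hypothetical.

```python
def missing_vhosts(vhosts_by_node):
    """Return, per node, the virtual hosts that other nodes have but it lacks."""
    all_vhosts = set().union(*vhosts_by_node.values())
    report = {}
    for node, vhosts in vhosts_by_node.items():
        missing = sorted(all_vhosts - set(vhosts))
        if missing:
            report[node] = missing
    return report
```

An empty result means every node reported the same set of virtual hosts.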

Repair

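Restart the RabbitMQ cluster nodes to resolve the split-brain condition. Choose the partition you trust most, then stop and restart the nodes in the other partitions so that they rejoin the cluster and restore their state from the trusted partition. Changes made only on the restarted partitions are lost. Finally, restart the nodes in the trusted partition to clear the partition warning.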
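
The restart can be sketched as a plan that stops and then starts every node outside one chosen partition. This is an illustration only: choosing which partition to keep is an operator decision, and it assumes `rabbitmqctl -n <node> stop_app` and `start_app` can reach each node from where the script runs. The helper names are hypothetical.

```python
import subprocess


def restart_plan(partitions, trusted_index):
    """Build stop/start commands for every node outside the trusted partition.

    partitions is a list of node-name lists, one list per partition.
    """
    untrusted = [
        node
        for index, group in enumerate(partitions)
        if index != trusted_index
        for node in group
    ]
    stops = [["rabbitmqctl", "-n", node, "stop_app"] for node in untrusted]
    starts = [["rabbitmqctl", "-n", node, "start_app"] for node in untrusted]
    return stops + starts  # stop all untrusted nodes first, then start them


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Keeping `dry_run=True` by default lets the plan be reviewed before anything is restarted.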

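Review the RabbitMQ configuration and adjust the settings to prevent future split-brain scenarios. In particular, consider setting cluster_partition_handling in rabbitmq.conf to a strategy such as pause_minority or autoheal instead of the default ignore, and verify that the network links between nodes are reliable.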
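
When weighing `pause_minority`, it helps to see which side of a split would pause. Under that strategy a node pauses when it can see half or fewer of the cluster's nodes. The function below is a hypothetical helper that only models this rule; it does not talk to RabbitMQ.

```python
def would_pause(cluster_size, partition_size):
    """Return True if nodes in a partition of this size would pause.

    Models pause_minority: a partition holding half or fewer of the
    cluster's nodes counts as a minority.
    """
    if not 0 < partition_size <= cluster_size:
        raise ValueError("partition_size must be between 1 and cluster_size")
    return partition_size * 2 <= cluster_size
```

In a three-node cluster split 2/1, only the single node pauses. In a two-node cluster both sides count as a minority and both pause, which is why this strategy suits clusters with an odd number of three or more nodes.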
