Runbook

MongoDB replica member unhealthy incident.

Overview

This incident occurs when one or more members of a MongoDB replica set are marked as unhealthy. Common causes include network issues, hardware failures, and configuration problems. An unhealthy member reduces the redundancy of the replica set and can degrade availability and performance. If so many members are unavailable that a majority cannot be reached, the set cannot elect a primary and stops accepting writes, and any further failure increases the risk of data loss. Resolve the incident promptly to restore the replica set to normal operation.

Parameters

Debug

Check if MongoDB is running

Check the replica set status

Check the replica set configuration

Check the replica set members

Get the MongoDB log file

Check the disk usage
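
A full data volume is a common reason for a member to fall out of the set. The sketch below reports how full the filesystem holding a given path is, using only the Python standard library. The path passed in is an assumption: substitute the dbPath from the member's own configuration.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

For example, disk_usage_percent("/var/lib/mongodb") could be run on each member, assuming that directory is the dbPath on the host.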

Check the memory usage

Check the MongoDB process ID

Check the CPU usage

Check the MongoDB replica set members status

Check the MongoDB replica set members health

Check the MongoDB replica set members state
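
The status, health, and state checks above all read the same document, the output of the replSetGetStatus command (rs.status() in the shell). As a minimal sketch, the function below takes that document as a plain dictionary and lists the members that look unhealthy. The set of states treated as healthy is an assumption made for this example and can be adjusted.

```python
from typing import Any

# States treated as healthy in this sketch (an assumption, adjust as needed).
HEALTHY_STATES = {"PRIMARY", "SECONDARY", "ARBITER"}


def unhealthy_members(status: dict[str, Any]) -> list[dict[str, Any]]:
    """List members whose health is not 1 or whose state is not a healthy one."""
    problems = []
    for member in status.get("members", []):
        health = member.get("health")
        state = member.get("stateStr")
        if health != 1 or state not in HEALTHY_STATES:
            problems.append({"name": member.get("name"), "health": health, "stateStr": state})
    return problems
```

With PyMongo, the input could be obtained with MongoClient(uri).admin.command("replSetGetStatus"), where uri is the connection string for the replica set.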

Check the MongoDB version

Check the MongoDB storage engine

Check the MongoDB memory usage

Check the MongoDB network usage

Check the MongoDB oplog size

Check the MongoDB oplog window

Check the MongoDB oplog length

Check the MongoDB oplog utilization

Check the MongoDB oplog capacity
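
The oplog size, window, utilization, and capacity checks can be derived from two inputs: the collection statistics of local.oplog.rs and the timestamps of its oldest and newest entries. The sketch below assumes the statistics document carries size and maxSize fields in bytes, as collStats typically reports for a capped collection, and that the timestamps are given as seconds since the epoch. Both function names are hypothetical.

```python
def oplog_window_hours(first_ts_seconds: int, last_ts_seconds: int) -> float:
    """Hours of operations the oplog currently holds (newest minus oldest entry)."""
    return (last_ts_seconds - first_ts_seconds) / 3600.0


def oplog_utilization(coll_stats: dict) -> float:
    """Fraction of the configured oplog capacity in use (size / maxSize)."""
    return coll_stats["size"] / coll_stats["maxSize"]
```

With PyMongo, the oldest and newest entries could be read from local.oplog.rs sorted by $natural ascending and descending, taking the time attribute of each entry's ts field. A short window means a member that is down for longer than the window can no longer catch up from the oplog and needs a resync.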

Check the MongoDB oplog status

Check the MongoDB oplog sync status

Check the MongoDB oplog lag time
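
Replication lag can be estimated from the replSetGetStatus document by comparing each secondary's optimeDate with the primary's. The function below is a minimal sketch of that calculation. It takes the status document as a plain dictionary and returns an empty result when no primary is present.

```python
from datetime import datetime


def replication_lag_seconds(status: dict) -> dict:
    """Map each SECONDARY member's name to its lag behind the primary, in seconds."""
    members = status.get("members", [])
    primary = next((m for m in members if m.get("stateStr") == "PRIMARY"), None)
    if primary is None:
        return {}
    lags = {}
    for member in members:
        if member.get("stateStr") == "SECONDARY":
            delta = primary["optimeDate"] - member["optimeDate"]
            lags[member["name"]] = delta.total_seconds()
    return lags
```

Members that are unreachable are skipped here because they are not in the SECONDARY state. They should be investigated separately.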

Check the MongoDB oplog sync source

Check the MongoDB oplog sync state

Repair

Define the IP addresses of the MongoDB replica members

Check the network connectivity between MongoDB replica members
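
A simple way to test connectivity is to attempt a TCP connection from each member to every other member on the MongoDB port. The sketch below assumes the default port 27017 and a three-second timeout. Both are parameters and should match the deployment.

```python
import socket


def can_connect(host: str, port: int = 27017, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A successful connection only shows that the port is reachable. It does not confirm that mongod is healthy or that authentication works.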

Verify that each member's MongoDB configuration file sets the correct replica set name (replication.replSetName), and that the replica set configuration (rs.conf()) has the correct members list and priority settings.
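
The replica set configuration document can be checked mechanically against what is expected. The sketch below takes the config document as a plain dictionary, along with an expected set name and host list supplied by the operator, and returns a list of problems found. The function name and the specific checks are assumptions made for illustration.

```python
def config_problems(config: dict, expected_name: str, expected_hosts: list) -> list:
    """Compare a replica set config document with the expected name and hosts."""
    problems = []
    if config.get("_id") != expected_name:
        problems.append(f"replica set name is {config.get('_id')!r}, expected {expected_name!r}")
    members = config.get("members", [])
    hosts = [m.get("host") for m in members]
    for host in sorted(set(expected_hosts) - set(hosts)):
        problems.append(f"missing member {host}")
    for host in sorted(set(hosts) - set(expected_hosts)):
        problems.append(f"unexpected member {host}")
    # A member with priority 0 can never become primary; the default priority is 1.
    if not any(m.get("priority", 1) > 0 for m in members):
        problems.append("no member has a priority above 0")
    return problems
```

With PyMongo, the input could be taken from the config field of MongoClient(uri).admin.command("replSetGetConfig").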

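Restart the MongoDB replica set members one at a time, starting with the secondaries and restarting the primary last, after stepping it down. Wait for each member to rejoin and catch up before restarting the next, so that the set keeps a majority throughout.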
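
One way to choose the restart order is to take the current replSetGetStatus document and put every non-primary member first and the primary last. The sketch below only computes that order. It does not restart anything, and the function name is hypothetical.

```python
def restart_order(status: dict) -> list:
    """Return member names with non-primary members first and the primary last."""
    members = status.get("members", [])
    others = [m["name"] for m in members if m.get("stateStr") != "PRIMARY"]
    primary = [m["name"] for m in members if m.get("stateStr") == "PRIMARY"]
    return others + primary
```

Before the final restart, the primary would typically be stepped down (rs.stepDown() in the shell) so that a secondary takes over first.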
