This incident occurs when a Cassandra node becomes unavailable. An unavailable node can interrupt normal operation of the system and, in some cases, lead to data loss or corruption. Common causes include hardware failure, network issues, and software bugs. Resolve the issue as quickly as possible to minimize the impact on the application.
Parameters
Debug
Check Cassandra service status
Check Cassandra process status
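The service and process checks above can be scripted. The following is a minimal sketch, assuming a systemd host where the unit is named `cassandra` and the JVM command line contains `CassandraDaemon`; both names are assumptions and may differ on your installation. The `run` parameter is a hypothetical hook so the helpers can be exercised without a live node.

```python
import subprocess


def service_is_active(unit="cassandra", run=subprocess.run):
    """Return True if systemd reports the unit as active.

    The unit name "cassandra" is an assumption; adjust it for your packaging.
    """
    # `systemctl is-active --quiet` exits 0 only when the unit is active.
    result = run(["systemctl", "is-active", "--quiet", unit], check=False)
    return result.returncode == 0


def process_is_running(pattern="CassandraDaemon", run=subprocess.run):
    """Return True if any process command line matches the pattern.

    The pattern "CassandraDaemon" is an assumption about the JVM main class.
    """
    # `pgrep -f` matches against the full command line and exits 0 on a match.
    result = run(["pgrep", "-f", pattern], check=False, capture_output=True)
    return result.returncode == 0
```

If the two results disagree, for example the unit is inactive but a matching process exists, the node may have been started outside systemd, which is worth confirming before restarting anything.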
Check Cassandra system logs
Check Cassandra node health
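Node health is commonly read from `nodetool status`, where each node row starts with a two-letter code: `U` or `D` for up or down, followed by `N`, `L`, `J`, or `M` for normal, leaving, joining, or moving. The sketch below parses that text and lists the down nodes. The function names are hypothetical, and the parser assumes the address is the second column, so verify it against the output of your Cassandra version.

```python
import re

# Matches node rows such as "DN  10.0.0.2  ...":
# status letter (U/D), state letter (N/L/J/M), then the address column.
_NODE_ROW = re.compile(r"^([UD])([NLJM])\s+(\S+)")


def parse_nodetool_status(output):
    """Return (address, status, state) tuples found in `nodetool status` text."""
    nodes = []
    for line in output.splitlines():
        match = _NODE_ROW.match(line)
        if match:
            status, state, address = match.groups()
            nodes.append((address, status, state))
    return nodes


def down_nodes(output):
    """Return the addresses of nodes whose status letter is D (down)."""
    return [addr for addr, status, _ in parse_nodetool_status(output) if status == "D"]
```

The input text can be captured with `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout` on a host where `nodetool` is on the PATH.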
Check Cassandra node ring information
Check Cassandra node gossip information
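Ring and gossip information are typically read with `nodetool ring` and `nodetool gossipinfo`. As a rough sketch, the helper below extracts the gossip STATUS state for each endpoint. It assumes the layout where an unindented endpoint line such as `/10.0.0.1` is followed by indented `KEY:value` lines; that layout varies between Cassandra versions, so treat the parser as illustrative and check it against real output first. The function name is hypothetical.

```python
def gossip_states(output):
    """Map each endpoint in `nodetool gossipinfo` text to its STATUS state.

    Endpoints without a STATUS line map to None. The layout handled here is
    an assumption and may not match every Cassandra version.
    """
    states = {}
    endpoint = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: start of a new endpoint section, e.g. "/10.0.0.1".
            endpoint = line.strip().split("/")[-1]
            states.setdefault(endpoint, None)
        elif endpoint is not None:
            key, _, rest = line.strip().partition(":")
            if key in ("STATUS", "STATUS_WITH_PORT") and states[endpoint] is None:
                # Value looks like "14:NORMAL,<token>" or "NORMAL,<token>";
                # keep the word just before the first comma.
                states[endpoint] = rest.split(",")[0].split(":")[-1]
    return states
```

An endpoint that other nodes report in a state other than `NORMAL`, or that is missing from the result entirely, is a reasonable place to look next.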
Check Cassandra node log for errors
Check Cassandra node log for warnings
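The error and warning checks above can be combined into one pass over the log. This is a minimal sketch, assuming each log entry starts with its level (for example `ERROR [thread] ...`), as in Cassandra's default logback pattern; a customized pattern would need a different regular expression. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumes entries begin with the level, e.g. "ERROR [main] 2024-01-01 ...".
_LEVEL = re.compile(r"^(ERROR|WARN)\b")


def scan_log(lines):
    """Group log lines by level, keeping only ERROR and WARN entries.

    Continuation lines such as stack traces do not start with a level
    and are skipped.
    """
    found = defaultdict(list)
    for line in lines:
        match = _LEVEL.match(line)
        if match:
            found[match.group(1)].append(line.rstrip("\n"))
    return dict(found)
```

A file object can be passed directly, for example `scan_log(open("/var/log/cassandra/system.log"))`. That path is a common package default, but it is an assumption here and depends on how Cassandra was installed.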
Check for any network connectivity issues
Check for any firewall issues
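Connectivity and firewall problems can often be told apart by how a TCP connection attempt fails. The sketch below probes the default Cassandra ports (7000 for inter-node traffic, 7199 for JMX, 9042 for CQL clients); these are defaults that `cassandra.yaml` or the environment may override, and the helper names are hypothetical.

```python
import socket

# Default Cassandra ports; confirm against your configuration, they may differ.
DEFAULT_PORTS = {"storage": 7000, "jmx": 7199, "cql": 9042}


def probe(host, port, timeout=2.0):
    """Classify a TCP connection attempt as "open", "refused", "timeout" or "error"."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout.
        return "timeout"
    except OSError:
        # Anything else, e.g. name resolution failure or unreachable network.
        return "error"


def probe_node(host, ports=DEFAULT_PORTS, timeout=2.0):
    """Probe each named port on a host and return {name: result}."""
    return {name: probe(host, port, timeout) for name, port in ports.items()}
```

As a rule of thumb, `refused` suggests the host is reachable but Cassandra is not listening, while `timeout` more often points to dropped packets, such as a firewall rule. JMX is frequently bound to localhost only, so a `refused` result for 7199 from a remote host is not necessarily a fault.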
Repair
Attempt to repair the Cassandra installation or reinstall it if necessary.
Restore from a recent backup if data loss or corruption has occurred.
Reboot the node to attempt to clear any software issues that may be causing the unavailability.