Runbook

Vault cluster health incident on Kubernetes

Overview

This incident is triggered when a Vault cluster instance running on Kubernetes is not healthy and requires attention. Responding to it involves evaluating the current state of the instance, diagnosing the cause, and taking corrective action to restore the instance to health.

Debug

Get the list of pods running in the Kubernetes cluster
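A minimal sketch of this step, wrapped in a small helper function. The default namespace `vault` is an assumption, not something this runbook specifies; pass the real namespace as the first argument.

```shell
# List pods with node and IP details.
# "vault" is a placeholder default namespace (assumption).
list_pods() {
  kubectl get pods --namespace "${1:-vault}" --output wide
}
```

For example, `list_pods` or `list_pods my-namespace`. Pods that are not `Running`, or whose `READY` count is short, are the first candidates for the steps that follow.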

Get the logs of a specific pod
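One way to fetch recent logs for a single pod. The pod name `vault-0` and namespace `vault` are illustrative defaults (assumptions); substitute the names shown in your pod listing.

```shell
# Print the last N log lines of one pod.
# Arguments: pod name, namespace, number of lines (all optional, defaults are assumptions).
pod_logs() {
  kubectl logs "${1:-vault-0}" --namespace "${2:-vault}" --tail="${3:-200}"
}
```

If the container has restarted, running `kubectl logs` with `--previous` shows the logs of the previous container instance, which usually contain the reason for the crash.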

Check the status of a Kubernetes deployment
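A sketch of a status check that compares desired and ready replicas. The namespace `vault` is an assumed default. The resource kind is a parameter because Vault servers are often run as a StatefulSet rather than a Deployment, depending on how the cluster was installed.

```shell
# Show ready/desired replica counts for a workload kind in a namespace.
# Arguments: resource kind (default "deployment"), namespace (default "vault", an assumption).
workload_status() {
  kubectl get "${1:-deployment}" --namespace "${2:-vault}" --output wide
}
```

For example, `workload_status statefulset` lists StatefulSets instead. A `READY` value lower than the desired count indicates pods that are missing or failing their readiness checks.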

Describe a Kubernetes pod to get more information about its state
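A minimal sketch, again assuming a pod named `vault-0` in a namespace named `vault` (both placeholders).

```shell
# Show detailed state for one pod: conditions, container statuses, and recent events.
describe_pod() {
  kubectl describe pod "${1:-vault-0}" --namespace "${2:-vault}"
}
```

The `Events` section at the end of the output is usually the most useful part: it reports scheduling failures, image pull errors, and failed liveness or readiness probes.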

Check if the Vault cluster is running and accessible
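One possible check is to run `vault status` inside a Vault pod and interpret its exit code: 0 means unsealed, 2 means sealed, and 1 means an error. The sketch below assumes the `vault` binary is available in the container and uses `vault-0` and `vault` as placeholder pod and namespace names.

```shell
# Run "vault status" in a pod and summarize the result from its exit code.
# Pod and namespace defaults are assumptions; pass your own as arguments.
vault_seal_state() {
  pod="${1:-vault-0}"
  rc=0
  kubectl exec "$pod" --namespace "${2:-vault}" -- vault status || rc=$?
  case "$rc" in
    0) echo "$pod: unsealed" ;;
    2) echo "$pod: sealed" ;;
    *) echo "$pod: error (exit code $rc)" ;;
  esac
  return "$rc"
}
```

An `error` result also covers the case where `kubectl exec` itself fails, for example because the pod does not exist or is not running.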

Check the logs of the Vault server to see if there are any errors
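A rough way to narrow the server logs down to likely problems is to filter for error and warning lines. The pod name, namespace, and one-hour window below are assumed defaults, and the filter is a plain case-insensitive text match, so it may need adjusting for your log format.

```shell
# Show only log lines mentioning errors or warnings from the recent past.
# Arguments: pod name, namespace, time window (defaults are assumptions).
vault_log_errors() {
  kubectl logs "${1:-vault-0}" --namespace "${2:-vault}" --since="${3:-1h}" \
    | grep -iE 'error|warn'
}
```

`grep` exits with status 1 when nothing matches, so a non-zero exit with no output means no matching lines were found in the window.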

Check the health of the Vault cluster nodes
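Node health can also be read from Vault's `/v1/sys/health` endpoint, whose HTTP status code encodes the node state when default query parameters are used. The sketch below assumes each node's API address is reachable from where the script runs (for example through a port-forward) and that `curl` is installed; the address you pass in is your own.

```shell
# Translate a /v1/sys/health status code into a readable node state.
health_state() {
  case "$1" in
    200) echo "initialized, unsealed, active" ;;
    429) echo "unsealed, standby" ;;
    472) echo "disaster recovery secondary" ;;
    473) echo "performance standby" ;;
    501) echo "not initialized" ;;
    503) echo "sealed" ;;
    *)   echo "unexpected response ($1)" ;;
  esac
}

# Query one node's health endpoint. Argument: base address of the node.
check_node_health() {
  code="$(curl --silent --output /dev/null --write-out '%{http_code}' "$1/v1/sys/health")" || true
  echo "$1: $(health_state "$code")"
}
```

For example, after forwarding a pod's API port to the local machine, `check_node_health http://127.0.0.1:8200` reports that node's state. Use `https` if the listener has TLS enabled. A healthy cluster has exactly one active node, with the remaining nodes in standby.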

Check whether a network issue is affecting the connectivity or performance of the Vault cluster instance.
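One starting point for a connectivity check is to confirm that the Service in front of Vault still has pod addresses behind it. The Service name `vault` and namespace `vault` are assumptions for illustration.

```shell
# List the pod IPs and ports currently backing a Service.
# Arguments: Service name, namespace (defaults are assumptions).
vault_endpoints() {
  kubectl get endpoints "${1:-vault}" --namespace "${2:-vault}" --output wide
}
```

An empty endpoint list means no pod is passing its readiness check, so clients cannot reach Vault through the Service. By default Vault serves its API on port 8200 and uses port 8201 for traffic between cluster nodes, so both need to be allowed by any network policies in place.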

Repair

Review the logs of the Vault cluster instance to identify the root cause of the issue, looking for error messages and other indicators of a problem with the instance.

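If the logs do not point to a more targeted fix, restart the Vault cluster instance, either by stopping and starting the instance or, if necessary, by rebooting the server it runs on. A restarted Vault node comes back sealed unless auto-unseal is configured, so confirm its seal status afterwards.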
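On Kubernetes, a restart usually means deleting the pod so that its controller recreates it. The sketch below assumes the Vault pod is managed by a controller such as a StatefulSet or Deployment; the pod name `vault-0` and namespace `vault` are placeholders.

```shell
# Delete one Vault pod so its controller schedules a replacement.
# Arguments: pod name, namespace (defaults are assumptions).
restart_vault_pod() {
  kubectl delete pod "${1:-vault-0}" --namespace "${2:-vault}"
}
```

Restart one pod at a time and wait for the replacement to become healthy before moving on, so the cluster keeps enough nodes available to serve requests.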
