Rook (Ceph Operator): Insufficient Number of Monitor Pods to Maintain Quorum

The Ceph cluster is unable to maintain a quorum due to an inadequate number of monitor pods.

Understanding Rook (Ceph Operator)

Rook is an open-source cloud-native storage orchestrator for Kubernetes, providing a platform, framework, and support for Ceph storage systems. Ceph is a highly scalable distributed storage system that provides object, block, and file storage. Rook automates the deployment, bootstrapping, configuration, scaling, upgrading, and monitoring of Ceph clusters.

For more details on Rook, visit the official Rook website.

Identifying the Symptom: TOO_FEW_MONS

When managing a Ceph cluster with Rook, you might encounter the error code TOO_FEW_MONS. This error indicates that the cluster does not have enough monitor (mon) pods to maintain a quorum. A quorum is essential because the monitors must agree on the current cluster state before clients can read or write consistently.

Typically, this issue manifests as a warning or error message in the Rook operator logs or the Ceph status output.

Exploring the Issue: Insufficient Monitor Pods

The TOO_FEW_MONS error arises when the number of active monitor pods falls below the required threshold to maintain a quorum. Ceph requires a majority of monitors to be available to make decisions about the cluster state. For example, in a cluster with three monitors, at least two must be operational.

This situation can occur due to various reasons, such as pod failures, network issues, or resource constraints.
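The majority rule above can be stated as floor(n/2) + 1. A quick sketch in plain Python, for illustration only:

```python
def quorum_majority(mon_count: int) -> int:
    """Minimum number of monitors that must be up to keep quorum."""
    return mon_count // 2 + 1

for n in (1, 3, 5):
    print(f"{n} monitors -> need {quorum_majority(n)} up")
```

This is also why monitor counts are kept odd: with 4 monitors you still need 3 up, so the fourth adds failure surface without adding fault tolerance.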

Steps to Resolve the TOO_FEW_MONS Issue

Step 1: Verify Current Monitor Status

First, check the status of the monitor pods to identify which ones are down. You can do this by running the following command:

kubectl -n rook-ceph get pods -l app=rook-ceph-mon

This command lists all the monitor pods and their current status.

Step 2: Check Ceph Cluster Health

Next, check the overall health of the Ceph cluster to confirm the quorum status. The ceph CLI is typically run from the Rook toolbox pod (deployed from the toolbox manifest in the Rook repository), since the original command needs a target pod to exec into:

kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status

Look for the quorum status and the number of active monitors.
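If you want to check this programmatically, ceph status can emit JSON with --format json. A minimal sketch that evaluates quorum from that output; the field names monmap.num_mons and quorum_names are assumptions based on recent Ceph releases, so verify them against your cluster's actual output:

```python
import json

def quorum_ok(status_json: str) -> bool:
    """Check whether a majority of monitors are in quorum.

    Assumes the shape of `ceph status --format json`, where
    monmap.num_mons is the total monitor count and quorum_names
    lists the monitors currently in quorum.
    """
    status = json.loads(status_json)
    total = status["monmap"]["num_mons"]
    in_quorum = len(status["quorum_names"])
    return in_quorum >= total // 2 + 1

# Hand-written example document, trimmed to just the fields we read:
sample = json.dumps({"monmap": {"num_mons": 3}, "quorum_names": ["a", "b"]})
print(quorum_ok(sample))  # True: 2 of 3 monitors is a majority
```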

Step 3: Scale Up Monitor Pods

If the number of monitors is insufficient, increase the monitor count in the CephCluster custom resource; Rook will reconcile the monitor deployments to match. Edit the resource directly:

kubectl -n rook-ceph edit cephcluster rook-ceph

In the editor, find the mon section and increase the count value:

spec:
  mon:
    count: 3

Save and exit the editor. Rook will automatically create additional monitor pods to meet the new count. Keep the count odd (1, 3, 5) so a strict majority can always be formed.
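For reference, a slightly fuller version of the mon section above. The allowMultiplePerNode field is optional and defaults to false, which is usually what you want:

```yaml
spec:
  mon:
    # Use an odd count so a strict majority always exists.
    count: 3
    # Keep each monitor on a separate node so a single node
    # failure cannot take down more than one monitor.
    allowMultiplePerNode: false
```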

Step 4: Monitor the Changes

After scaling up, monitor the status of the new monitor pods:

kubectl -n rook-ceph get pods -l app=rook-ceph-mon

Ensure that all monitor pods are running and ready. Recheck the Ceph cluster status to confirm that the quorum is restored:

kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status

Conclusion

Maintaining the correct number of monitor pods is crucial for the stability and reliability of a Ceph cluster managed by Rook. By following the steps outlined above, you can resolve the TOO_FEW_MONS issue and ensure that your cluster maintains a healthy quorum.

For further reading, refer to the Rook Ceph Quickstart Guide.
