OpenSearch Cluster Node Disk Full

A node's disk is full, preventing further data operations.

Understanding OpenSearch

OpenSearch is an open-source search and analytics suite forked from Elasticsearch. It provides a platform for searching, analyzing, and visualizing data in real time. OpenSearch is commonly used for log analytics, full-text search, and operational monitoring, making it a critical component in many data-driven applications.

Symptom: Cluster Node Disk Full

In OpenSearch, the Cluster Node Disk Full alert is a critical warning indicating that one or more nodes in your cluster have reached their disk capacity limits. This alert is typically raised by Prometheus, a popular monitoring tool, which evaluates alerting rules against the health and performance metrics it collects from your OpenSearch cluster.

Details About the Alert

Why This Alert Occurs

The Cluster Node Disk Full alert occurs when the disk space on a node is exhausted. This situation can prevent the node from performing essential data operations, such as indexing new documents or replicating data across the cluster. As a result, the overall performance and reliability of your OpenSearch cluster may be compromised.

Impact on Cluster Operations

When a node's disk is full, it can lead to several issues, including:

  • Inability to write new data to the node.
  • Reduced redundancy, and a risk of data loss, if replica shards on the node cannot be written or kept in sync.
  • Increased load on other nodes, potentially leading to further performance degradation.

Steps to Fix the Alert

Step 1: Identify the Affected Node

First, determine which node(s) have full disks. You can use the OpenSearch _cat/nodes API to check the disk usage of each node:

GET _cat/nodes?v&h=name,disk.used_percent

This command will list all nodes along with their disk usage percentage. Identify nodes with disk usage close to or at 100%.
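
The same check can be scripted. The following is a minimal Python sketch, not a definitive implementation: it assumes the cluster is reachable without authentication at the placeholder address http://localhost:9200, and the function names and the 85% default threshold are illustrative choices rather than anything prescribed by OpenSearch. It requests the _cat/nodes output as JSON (format=json) and returns the nodes at or above the threshold, fullest first.

```python
import json
import urllib.request


def nodes_over_threshold(rows, threshold_percent=85.0):
    """Return (name, used_percent) pairs at or above the threshold, fullest first."""
    flagged = []
    for row in rows:
        raw = row.get("disk.used_percent")
        if not raw:
            # Some nodes may report no disk figure; skip them rather than guess.
            continue
        used = float(raw)  # _cat APIs return numbers as strings
        if used >= threshold_percent:
            flagged.append((row["name"], used))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


def fetch_node_disk_rows(base_url="http://localhost:9200"):
    """Fetch _cat/nodes as JSON. base_url is a placeholder; no auth or TLS handling."""
    url = base_url + "/_cat/nodes?format=json&h=name,disk.used_percent"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    for name, used in nodes_over_threshold(fetch_node_disk_rows()):
        print(f"{name}: {used:.1f}% disk used")
```

Keeping the filtering separate from the HTTP call makes it easy to run the check against saved output or to swap in an authenticated client.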

Step 2: Free Up Disk Space

Once you've identified the affected node, consider the following actions to free up disk space:

  • Delete Unnecessary Data: Remove old or unnecessary indices using the DELETE API. For example:
    DELETE /index_name
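  • Optimize Indices: Use the _forcemerge API to reduce the number of segments in an index. Merging reclaims the space held by deleted documents, but it temporarily needs extra disk space while the new segments are written, so run it only when the node has some headroom, and preferably on indices that are no longer being written to:
    POST /index_name/_forcemerge?max_num_segments=1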
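
Before deleting anything, it helps to see which indices actually occupy the most space. The sketch below is one possible way to do that in Python. It assumes the rows come from GET _cat/indices?format=json&bytes=b&h=index,store.size, and the function name, the default limit, and the rule of skipping dot-prefixed indices are illustrative assumptions.

```python
def largest_indices(rows, limit=5):
    """Return up to `limit` (index, size_bytes) pairs, largest first."""
    candidates = []
    for row in rows:
        name = row["index"]
        size = row.get("store.size")
        if name.startswith(".") or not size:
            # Dot-prefixed indices are usually system-managed, and closed
            # indices may report no size; leave both out of the candidate list.
            continue
        candidates.append((name, int(size)))  # bytes=b yields plain byte counts
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:limit]
```

The result is only a list of candidates to review. Whether an index is safe to delete still depends on your retention requirements.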

Step 3: Increase Disk Capacity

If freeing up space is not sufficient, consider increasing the disk capacity of the affected node. This may involve adding more storage to the existing node or migrating data to a node with more available space.
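
One way to migrate data off a full node is to exclude it from shard allocation, so that the cluster relocates its shards to the remaining nodes. The Python sketch below builds the request body for the cluster settings API using the cluster.routing.allocation.exclude._name setting. The function names and the placeholder address http://localhost:9200 are assumptions, and the sketch does not handle authentication or TLS. Note that this drains every shard from the node, so the other nodes need enough free space to absorb them.

```python
import json
import urllib.request

EXCLUDE_SETTING = "cluster.routing.allocation.exclude._name"


def drain_node_settings(node_name):
    """Build a settings body asking the allocator to move shards off node_name."""
    return {"persistent": {EXCLUDE_SETTING: node_name}}


def clear_drain_settings():
    """Build a settings body that removes the exclusion (null resets a setting)."""
    return {"persistent": {EXCLUDE_SETTING: None}}


def apply_cluster_settings(body, base_url="http://localhost:9200"):
    """PUT the body to _cluster/settings. base_url is a placeholder."""
    request = urllib.request.Request(
        base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

After the disk has been enlarged or the node replaced, sending the body from clear_drain_settings() lets the node receive shards again.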

Step 4: Monitor Disk Usage

To prevent future occurrences, set up monitoring and alerts for disk usage. Ensure that Prometheus is configured to alert you when disk usage exceeds a certain threshold, allowing you to take proactive measures before the disk becomes full.
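
When choosing alert thresholds, a reasonable starting point is OpenSearch's own default disk watermarks: at 85% (low) no new shards are allocated to the node, at 90% (high) the cluster tries to relocate shards away from it, and at 95% (flood stage) indices with a shard on the node are blocked for writes. The Python sketch below works through the arithmetic and maps a usage figure to a severity tier. The tier names and function names are illustrative, and your cluster may override the default watermarks.

```python
# Default OpenSearch disk watermarks, reused here as alert tiers.
# Assumption: the cluster has not overridden them in its settings.
LOW_WATERMARK = 85.0
HIGH_WATERMARK = 90.0
FLOOD_STAGE = 95.0


def used_percent(total_bytes, available_bytes):
    """Compute the percentage of disk in use from total and available bytes."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (total_bytes - available_bytes) / total_bytes


def classify_disk_usage(percent):
    """Map a usage percentage to an illustrative severity tier."""
    if percent >= FLOOD_STAGE:
        return "critical"
    if percent >= HIGH_WATERMARK:
        return "high"
    if percent >= LOW_WATERMARK:
        return "warning"
    return "ok"
```

Alerting at the lowest tier leaves time to free space or add capacity before writes are blocked.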

Deep Sea Tech Inc. — Made with ❤️ in Bangalore & San Francisco 🏢

Doctor Droid