OpenSearch Cluster Node Disk Full
A node's disk is full, preventing further data operations.
Understanding OpenSearch
OpenSearch is an open-source search and analytics suite derived from Elasticsearch. It provides a platform for searching, analyzing, and visualizing data in real time. OpenSearch is commonly used for log analytics, full-text search, and operational monitoring, making it a critical component in many data-driven applications.
Symptom: Cluster Node Disk Full
In OpenSearch, the Cluster Node Disk Full alert is a critical warning indicating that one or more nodes in your cluster have reached their disk capacity limits. This alert is typically triggered by Prometheus, a popular monitoring tool, which continuously checks the health and performance metrics of your OpenSearch cluster.
Details About the Alert
Why This Alert Occurs
The Cluster Node Disk Full alert occurs when the disk space on a node is exhausted. This situation can prevent the node from performing essential data operations, such as indexing new documents or replicating data across the cluster. As a result, the overall performance and reliability of your OpenSearch cluster may be compromised.
Impact on Cluster Operations
When a node's disk is full, it can lead to several issues, including:
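- Inability to write new data to the node. Once a node crosses the flood-stage disk watermark (95% by default), OpenSearch places a read-only block (index.blocks.read_only_allow_delete) on every index that has a shard on that node.
- Reduced redundancy and potential data loss, because replica shards cannot be allocated to or kept in sync on a node that has no free space.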
- Increased load on other nodes, potentially leading to further performance degradation.
Steps to Fix the Alert
Step 1: Identify the Affected Node
First, determine which node(s) have full disks. You can use the OpenSearch _cat/nodes API to check the disk usage of each node:
GET _cat/nodes?v&h=name,disk.used_percent
This command will list all nodes along with their disk usage percentage. Identify nodes with disk usage close to or at 100%.
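The check above can also be scripted. The following is a minimal sketch, assuming the same request is made with `format=json` so that each row arrives as a dictionary of strings. The function names, the 90% default threshold, and the `http://localhost:9200` address are illustrative assumptions, and the fetch helper handles neither authentication nor TLS.

```python
import json
from urllib.request import urlopen


def nodes_over_threshold(cat_nodes_rows, threshold_percent=90.0):
    """Return (name, used_percent) pairs at or above the threshold, fullest first."""
    flagged = []
    for row in cat_nodes_rows:
        raw = row.get("disk.used_percent")
        if raw in (None, ""):
            continue  # skip rows that carry no disk usage value
        used = float(raw)  # _cat JSON output encodes numbers as strings
        if used >= threshold_percent:
            flagged.append((row["name"], used))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


def fetch_cat_nodes(base_url="http://localhost:9200"):
    """Fetch _cat/nodes as JSON. base_url is an assumed, unauthenticated endpoint."""
    url = f"{base_url}/_cat/nodes?format=json&h=name,disk.used_percent"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


# Hypothetical sample rows standing in for a live cluster response.
sample_rows = [
    {"name": "node-1", "disk.used_percent": "97.31"},
    {"name": "node-2", "disk.used_percent": "62.10"},
    {"name": "node-3", "disk.used_percent": "91.05"},
]
print(nodes_over_threshold(sample_rows))
```

Against a real cluster, `nodes_over_threshold(fetch_cat_nodes())` would run the same filter on live data.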
Step 2: Free Up Disk Space
Once you've identified the affected node, consider the following actions to free up disk space:
- Delete Unnecessary Data: Remove old or unnecessary indices using the DELETE API. For example:
DELETE /index_name
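- Optimize Indices: Use the _forcemerge API to reduce the number of segments in your indices, which can reclaim space held by deleted documents. A force merge writes new segments before it removes the old ones, so it needs temporary free disk space. Run it only after some space has been freed, and preferably on indices that no longer receive writes: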
POST /index_name/_forcemerge?max_num_segments=1
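Deciding which indices to delete is easier with a list sorted by age. The sketch below is one possible approach, not a prescribed procedure: it assumes rows obtained from `GET _cat/indices?format=json&bytes=b&h=index,creation.date,store.size` and picks the oldest indices until a target number of bytes is covered. The function name and the rule of skipping dot-prefixed (system) indices are assumptions made for illustration. Note that `store.size` is the index's size across the whole cluster, not on a single node, so the space freed on the affected node may be smaller.

```python
def pick_indices_to_delete(cat_indices_rows, bytes_to_free, protected_prefixes=(".",)):
    """Pick oldest indices first until their combined size reaches bytes_to_free."""
    candidates = [
        row for row in cat_indices_rows
        if not row["index"].startswith(tuple(protected_prefixes))
    ]
    # creation.date is epoch milliseconds, encoded as a string in _cat JSON output.
    candidates.sort(key=lambda row: int(row["creation.date"]))

    chosen, freed = [], 0
    for row in candidates:
        if freed >= bytes_to_free:
            break
        chosen.append(row["index"])
        freed += int(row["store.size"] or 0)  # treat a missing size as 0
    return chosen, freed


# Hypothetical rows; sizes are in bytes because the request used bytes=b.
sample_indices = [
    {"index": "logs-2024.01", "creation.date": "1704067200000", "store.size": "5000"},
    {"index": "logs-2024.03", "creation.date": "1709251200000", "store.size": "7000"},
    {"index": "logs-2024.02", "creation.date": "1706745600000", "store.size": "4000"},
    {"index": ".kibana_1", "creation.date": "1600000000000", "store.size": "100"},
]
print(pick_indices_to_delete(sample_indices, bytes_to_free=8000))
```

The function only proposes candidates. Review the list before issuing any DELETE request, since deleting an index is irreversible.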
Step 3: Increase Disk Capacity
If freeing up space is not sufficient, consider increasing the disk capacity of the affected node. This may involve adding more storage to the existing node or migrating data to a node with more available space.
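To size the extra storage, a rough calculation is often enough. The sketch below assumes you want current usage to sit at or below a chosen percentage of the new disk size. The 80% target and the function name are illustrative assumptions, and the calculation ignores future data growth.

```python
def extra_capacity_needed(used_bytes, total_bytes, target_percent=80):
    """Return the bytes to add so used_bytes is at most target_percent of the disk."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    # Ceiling division on integers avoids floating-point rounding surprises.
    required_total = -(-used_bytes * 100 // target_percent)
    return max(0, required_total - total_bytes)


# Example: 950 GB used on a 1000 GB disk, aiming for 80% usage or less.
gb = 1024 ** 3
print(extra_capacity_needed(950 * gb, 1000 * gb) / gb)
```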
Step 4: Monitor Disk Usage
To prevent future occurrences, set up monitoring and alerts for disk usage. Ensure that Prometheus is configured to alert you when disk usage exceeds a threshold well below 100% (for example, 80%), so that you can act before the disk becomes full.
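When choosing alert thresholds, it helps to line them up with the disk watermarks that OpenSearch itself acts on. The sketch below classifies a usage percentage against the default watermark values (85% low, 90% high, 95% flood stage). These defaults are an assumption about your cluster, since they can be overridden through the `cluster.routing.allocation.disk.watermark.*` settings, and the function name is hypothetical.

```python
def watermark_status(used_percent, low=85.0, high=90.0, flood_stage=95.0):
    """Classify disk usage against the (assumed default) disk watermarks."""
    if used_percent >= flood_stage:
        return "flood_stage"  # indices with shards on the node become read-only
    if used_percent >= high:
        return "high"  # OpenSearch tries to relocate shards away from the node
    if used_percent >= low:
        return "low"  # no new shards are allocated to the node
    return "ok"


for usage in (70.0, 86.5, 92.0, 97.3):
    print(usage, watermark_status(usage))
```

Alerting at or slightly before the low watermark leaves time to act before shard allocation is affected.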