OpenSearch High CPU Usage
The CPU usage on the OpenSearch nodes is consistently above the threshold.
Understanding OpenSearch
OpenSearch is a powerful, open-source search and analytics suite derived from Elasticsearch. It is designed to provide a robust, scalable, and secure solution for searching, visualizing, and analyzing large volumes of data in real-time. OpenSearch is often used for log analytics, full-text search, and operational intelligence.
Symptom: High CPU Usage
The Prometheus alert for High CPU Usage fires when CPU utilization on OpenSearch nodes consistently exceeds a predefined threshold. It warrants prompt attention because sustained high CPU can degrade the performance and responsiveness of the entire cluster.
Understanding the High CPU Usage Alert
High CPU usage in OpenSearch can be symptomatic of several underlying issues. It may indicate that the cluster is under heavy load, possibly due to inefficient queries, insufficient resources, or misconfigured settings. Persistent high CPU usage can lead to slower query responses and degraded performance.
Common Causes of High CPU Usage
- Resource-intensive queries or aggregations.
- Insufficient hardware resources allocated to the cluster.
- Background processes such as garbage collection or indexing.
- Misconfigured cluster settings or plugins.
Steps to Resolve High CPU Usage
1. Identify Resource-Intensive Queries
Use the _tasks API to identify long-running or resource-intensive queries:
GET _tasks?detailed=true&actions=*search&nodes=*
Analyze the output to identify queries that are consuming excessive CPU resources. Consider optimizing these queries by adding filters, reducing the number of shards, or using more efficient query patterns.
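To make the _tasks output easier to triage, you can filter it programmatically for long-running search tasks. The sketch below assumes a simplified response shape; a real `_tasks?detailed=true` response contains many more fields, and the node/task IDs and threshold here are illustrative only.

```python
# Filter a _tasks-style response for slow search tasks.
# "sample" is a hand-written, simplified stand-in for the real API response.
sample = {
    "nodes": {
        "node-1": {
            "tasks": {
                "node-1:123": {
                    "action": "indices:data/read/search",
                    "running_time_in_nanos": 45_000_000_000,
                    "description": "indices[logs-*], search_type[QUERY_THEN_FETCH]",
                },
                "node-1:124": {
                    "action": "indices:data/read/search",
                    "running_time_in_nanos": 200_000_000,
                    "description": "indices[metrics-*]",
                },
            }
        }
    }
}

def slow_search_tasks(tasks_response, threshold_seconds=30):
    """Return (task_id, seconds, description) for tasks over the threshold."""
    slow = []
    for node in tasks_response["nodes"].values():
        for task_id, task in node["tasks"].items():
            seconds = task["running_time_in_nanos"] / 1e9
            if seconds >= threshold_seconds:
                slow.append((task_id, seconds, task["description"]))
    return slow

print(slow_search_tasks(sample))
```

Tasks that keep appearing in this list are the first candidates for query optimization or, in extreme cases, cancellation via the task management API.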
2. Optimize Cluster Configuration
Review and optimize the cluster configuration settings. Ensure the JVM heap size is set appropriately: typically around 50% of the available RAM, and kept below roughly 32GB so the JVM can continue using compressed object pointers. Adjust the number of replicas and shards to balance the load across nodes.
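The heap-sizing rule above can be expressed as a small helper. The 31 GB ceiling used here is a common rule of thumb for staying safely under the compressed-pointers cutoff, not an official limit.

```python
def recommended_heap_gb(total_ram_gb, ceiling_gb=31):
    """Suggest a JVM heap size: half of RAM, capped below ~32 GB.

    The cap keeps the JVM on compressed object pointers; 31 GB is a
    conservative rule-of-thumb ceiling, not an exact JVM boundary.
    """
    return min(total_ram_gb // 2, ceiling_gb)

# A 128 GB node should NOT get a 64 GB heap:
print(recommended_heap_gb(128))  # -> 31
print(recommended_heap_gb(16))   # -> 8
```

Whatever value you choose, set the minimum and maximum heap to the same size (e.g. `-Xms8g` and `-Xmx8g` in `jvm.options`) so the heap never resizes at runtime.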
3. Scale Resources
If the cluster is consistently under heavy load, consider scaling up by adding more nodes or increasing the CPU and memory resources of existing nodes. This can be done by adjusting the configuration in your orchestration tool or cloud provider.
4. Monitor and Adjust Indexing
Check the indexing rate and optimize it if necessary. Use the _cat/indices API to review index sizes and document counts:
GET _cat/indices?v
Consider using bulk indexing to reduce the load on the cluster.
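Bulk indexing replaces many single-document requests with one _bulk request, whose body is NDJSON: an action line followed by the document source, one pair per document, with a trailing newline. A minimal sketch of building such a body (the index name and documents are made up for illustration):

```python
import json

def build_bulk_body(index_name, docs):
    """Build an NDJSON body for the _bulk API: for each document, an
    action line ({"index": ...}) followed by the document itself,
    terminated by the trailing newline the API requires."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

body = build_bulk_body("logs", [{"msg": "a"}, {"msg": "b"}])
# POST this body to /_bulk with header: Content-Type: application/x-ndjson
print(body)
```

Batching a few hundred to a few thousand documents per request is a reasonable starting point; measure and tune, since oversized bulk requests can themselves spike CPU and memory.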
Additional Resources
For more detailed guidance on optimizing OpenSearch performance, refer to the OpenSearch Documentation. Additionally, the Prometheus Documentation provides insights into setting up and managing alerts effectively.