OpenSearch High CPU Usage

The CPU usage on the OpenSearch nodes is consistently above the threshold.

Understanding OpenSearch

OpenSearch is a powerful, open-source search and analytics suite derived from Elasticsearch. It is designed to provide a robust, scalable, and secure solution for searching, visualizing, and analyzing large volumes of data in real time. OpenSearch is often used for log analytics, full-text search, and operational intelligence.

Symptom: High CPU Usage

The Prometheus alert for high CPU usage fires when CPU utilization on OpenSearch nodes consistently exceeds a predefined threshold. This alert matters because sustained high CPU usage can degrade the performance and responsiveness of the OpenSearch cluster.

Understanding the High CPU Usage Alert

High CPU usage in OpenSearch can be symptomatic of several underlying issues. It may indicate that the cluster is under heavy load, possibly due to inefficient queries, insufficient resources, or misconfigured settings. Persistent high CPU usage can lead to slower query responses and degraded performance.

Common Causes of High CPU Usage

  • Resource-intensive queries or aggregations.
  • Insufficient hardware resources allocated to the cluster.
  • Background processes such as garbage collection or indexing.
  • Misconfigured cluster settings or plugins.

Steps to Resolve High CPU Usage

1. Identify Resource-Intensive Queries

Use the _tasks API to identify long-running or resource-intensive queries:

GET _tasks?detailed=true&actions=*search&nodes=*

Analyze the output to identify queries that are consuming excessive CPU resources. Consider optimizing these queries by adding filters, reducing the number of shards, or using more efficient query patterns.
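To pinpoint which threads are actually consuming CPU on each node, the nodes hot threads API is also useful. It returns a stack-trace summary of the busiest threads:

GET _nodes/hot_threads

A runaway query identified in the _tasks output can then be stopped with the task cancellation API (replace the placeholder with the task ID from the _tasks response):

POST _tasks/<task_id>/_cancel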

2. Optimize Cluster Configuration

Review and optimize the cluster configuration settings. Ensure that the JVM heap size is set appropriately: typically around 50% of the available RAM, but below 32GB so the JVM can keep using compressed object pointers. Adjust the number of replicas and shards to balance the load across nodes.
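As an illustrative example, on a node with 32GB of RAM the heap could be pinned to 16GB in config/jvm.options (the value here is an assumption for a 32GB node; set the minimum and maximum to the same size to avoid resize pauses):

-Xms16g
-Xmx16g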

3. Scale Resources

If the cluster is consistently under heavy load, consider scaling up by adding more nodes or increasing the CPU and memory resources of existing nodes. This can be done by adjusting the configuration in your orchestration tool or cloud provider.
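As one illustrative sketch, if the cluster runs on Kubernetes, the pod resources in the OpenSearch StatefulSet spec might be raised along these lines (all values here are assumptions, not recommendations from this guide):

resources:
  requests:
    cpu: "4"
    memory: 16Gi
  limits:
    cpu: "8"
    memory: 16Gi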

4. Monitor and Adjust Indexing

Check the indexing rate and optimize it if necessary. Use the _cat/indices API to review index sizes, document counts, and health:

GET _cat/indices?v

Consider using bulk indexing to reduce the load on the cluster.
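A single _bulk request batches many index operations into one round trip, reducing per-request overhead. The index name below is illustrative:

POST _bulk
{ "index": { "_index": "app-logs" } }
{ "message": "first event" }
{ "index": { "_index": "app-logs" } }
{ "message": "second event" }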

Additional Resources

For more detailed guidance on optimizing OpenSearch performance, refer to the OpenSearch Documentation. Additionally, the Prometheus Documentation provides insights into setting up and managing alerts effectively.

Deep Sea Tech Inc. — Made with ❤️ in Bangalore & San Francisco 🏢
