Production-Ready Template

Proactive MongoDB Monitoring with Prometheus Alert Templates

Monitoring your MongoDB infrastructure with Prometheus is critical for ensuring performance, reliability, and early detection of anomalies. This post explores the MongoDB alert rule set from DrDroidLab's Prometheus Alert Templates repository. We’ll walk through the types of alerts defined, take a closer look at the key rules, and provide guidance on setup and tuning.

Get Template

Core Alert Rules

MongoDown

Service availability check

mongodb_up == 0

Why this matters

This alert fires when the MongoDB exporter reports that MongoDB is unreachable. It generally indicates MongoDB is either down, misconfigured, or experiencing network issues.

Tuning tips

Use 'for: 2m' to reduce noise during restarts or transient network blips. Consider excluding known maintenance windows with Alertmanager routing logic.
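As a rough sketch, the rule with the suggested 'for: 2m' might look like this in a Prometheus rule file. The group name, severity label, and annotation text are assumptions for illustration and may differ from what the repository ships.

```yaml
groups:
  - name: mongodb-availability        # hypothetical group name
    rules:
      - alert: MongoDown
        expr: mongodb_up == 0
        for: 2m                       # condition must hold for 2 minutes before firing
        labels:
          severity: critical          # assumed label; match it to your Alertmanager routes
        annotations:
          summary: "MongoDB is unreachable on {{ $labels.instance }}"
```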

MongoReplicationLag

Replication lag check

mongodb_mongod_replset_member_optime_date{state="SECONDARY"} < ignoring(state) (mongodb_mongod_replset_member_optime_date{state="PRIMARY"} - 60)

Why this matters

Detects replication lag greater than 60 seconds between primary and secondary members. Indicates issues with data sync which could lead to stale reads or failover delays.

Tuning tips

Adjust the lag threshold based on your application’s tolerance, e.g., 30s for real-time applications or 120s for batch workloads. Monitor trends to detect performance degradation.
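For a real-time application, the adjustment could be sketched as follows. The expression is the one shown above with only the threshold changed from 60 to 30; the group name, severity, and annotation are assumptions.

```yaml
groups:
  - name: mongodb-replication         # hypothetical group name
    rules:
      - alert: MongoReplicationLag
        # Same expression as the template, with the lag threshold lowered to 30 seconds.
        # Use 120 instead for batch workloads that tolerate more lag.
        expr: |
          mongodb_mongod_replset_member_optime_date{state="SECONDARY"}
            < ignoring(state)
          (mongodb_mongod_replset_member_optime_date{state="PRIMARY"} - 30)
        labels:
          severity: warning           # assumed label
        annotations:
          summary: "Replication lag above 30s on {{ $labels.instance }}"
```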

MongoReplicationStateNotPrimary

Primary availability check

min(mongodb_mongod_replset_member_state{state="PRIMARY"}) != 1

Why this matters

This checks whether a member of the replica set currently reports the PRIMARY state. If none does, an election is in progress or the replica set is unhealthy, and writes cannot be accepted until a primary is elected.

Tuning tips

Set a short 'for' duration (e.g., 1m) to quickly respond to failover events, but buffer slightly to avoid temporary election noise during restarts.
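A minimal sketch of the rule with a one-minute buffer, assuming the expression above is used unchanged; the group name, severity, and annotation text are illustrative.

```yaml
groups:
  - name: mongodb-replication         # hypothetical group name
    rules:
      - alert: MongoReplicationStateNotPrimary
        expr: min(mongodb_mongod_replset_member_state{state="PRIMARY"}) != 1
        for: 1m                       # short buffer so a brief election does not page anyone
        labels:
          severity: critical          # assumed label
        annotations:
          summary: "Replica set has no healthy PRIMARY"
```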

MongoReplicationStateNotSecondary

Service availability check

count(mongodb_mongod_replset_member_state{state="SECONDARY"}) < 1

Why this matters

Fires when no member of the replica set reports the SECONDARY state. Without a secondary, the replica set has lost the redundancy it needs for high availability and fault tolerance.

Tuning tips

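Use this alert in clusters with redundancy requirements. In smaller dev/test environments, consider disabling or loosening thresholds. Because count() returns no result rather than 0 when no series match, confirm in a test environment that the rule actually fires when every secondary is gone.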

MongoNumberCursorsOpenHigh

Open cursor count warning

mongodb_mongod_metrics_cursor_open{state="total"} > 10000

Why this matters

Indicates the number of open MongoDB cursors exceeds 10,000. Could signal long-lived queries or resource leaks affecting memory and performance.

Tuning tips

Baseline your typical open cursor count and adjust the threshold accordingly (e.g., high-throughput systems may need a higher threshold). Include query profiling if this alert fires frequently.
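One way to baseline is a recording rule that tracks the weekly peak, sketched below next to an alert with a raised threshold. The recording rule name, group name, and the 20000 figure are hypothetical; choose the threshold from your own baseline.

```yaml
groups:
  - name: mongodb-cursors             # hypothetical group name
    rules:
      # Highest open-cursor count seen over the past 7 days, useful as a baseline.
      - record: mongodb:cursor_open_total:max_7d   # hypothetical rule name
        expr: max_over_time(mongodb_mongod_metrics_cursor_open{state="total"}[7d])

      - alert: MongoNumberCursorsOpenHigh
        # Example threshold for a high-throughput system; the template uses 10000.
        expr: mongodb_mongod_metrics_cursor_open{state="total"} > 20000
        labels:
          severity: warning           # assumed label
        annotations:
          summary: "Open cursors above 20000 on {{ $labels.instance }}"
```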

Quick Setup

Clone the repo: git clone https://github.com/DrDroidLab/prometheus-alert-templates.git

Navigate to the mongoDB folder: cd prometheus-alert-templates/mongoDB

Copy the alert rule file into your Prometheus alerting rules directory

Reload Prometheus configuration or restart the Prometheus server

Validate alert rules are active using Prometheus UI or API
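The last three steps can be sketched as a prometheus.yml fragment. The rules directory and file name below are hypothetical, so substitute the actual file copied from the repository.

```yaml
# prometheus.yml (fragment)
rule_files:
  - /etc/prometheus/rules/mongodb-alerts.yml   # hypothetical path and file name

# Before reloading, the file can be checked with:
#   promtool check rules /etc/prometheus/rules/mongodb-alerts.yml
#
# To reload without a restart, send SIGHUP to the Prometheus process, or
# POST to /-/reload if Prometheus was started with --web.enable-lifecycle.
#
# Loaded rules are listed on the Alerts page of the Prometheus UI and
# at the /api/v1/rules API endpoint.
```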

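Frequently Asked Questions

Do these alerts require the MongoDB exporter?

Yes. Every rule above queries metrics such as mongodb_up and mongodb_mongod_replset_member_state, which the MongoDB exporter exposes to Prometheus.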

Ready to Get Started?

Start monitoring MongoDB effectively with these battle-tested Prometheus alert templates. Clone and customize the alert rules from the GitHub repo at https://github.com/DrDroidLab/prometheus-alert-templates/blob/master/mongoDB to match your alerting strategy and operational maturity.
