Alerts Configuration using Prometheus

To configure alerts in Prometheus based on specific metric thresholds and display those alerts in Grafana, follow these steps:

Step 1: Configure Prometheus Alerts

  1. Edit the Prometheus Configuration: Open your prometheus.yml configuration file and add a rule_files entry that points to your alerting rules. For example, you can set up a rule to trigger an alert when CPU usage exceeds 80%.

    Here’s an example of how to set this up:

     global:
       scrape_interval: 15s  # Default scrape interval
       evaluation_interval: 1m  # How often to evaluate rules
    
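     scrape_configs:
       - job_name: 'node-app'
         static_configs:
           - targets: ['localhost:3000']
       - job_name: 'node-exporter'  # Assumed: node_exporter provides node_cpu_seconds_total
         static_configs:
           - targets: ['localhost:9100']  # Default node_exporter port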
    
     rule_files:
       - "alerts.yml"  # Reference to the alerts file
    
  2. Create an Alerting Rule File: Create a new file named alerts.yml in the same directory as your prometheus.yml. Add the following alert rule to this file:

     groups:
     - name: example_alerts
       rules:
       - alert: HighCpuUsage
         expr: (1 - avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 0.8
         for: 5m
         labels:
           severity: critical
         annotations:
           summary: "High CPU Usage"
           description: "CPU usage is above 80% for more than 5 minutes."
    

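    This rule fires an alert called HighCpuUsage when the 5-minute average CPU usage of an instance stays above 80% for 5 consecutive minutes. While the condition is true but the for duration has not yet elapsed, the alert is in the pending state.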

  3. Restart Prometheus: After modifying the configuration files, restart Prometheus to apply the changes:

     ./prometheus --config.file=prometheus.yml
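
To make the interaction between expr and for in the rule above more concrete, here is a minimal Python sketch of a simplified model of rule evaluation. It is not Prometheus code: the function names cpu_usage and alert_states are hypothetical, the per-core idle fractions are assumed to be already computed, and details such as resolved alerts are ignored.

```python
from statistics import mean


def cpu_usage(idle_rates: list[float]) -> float:
    """Mirror `1 - avg(rate(...{mode="idle"}[5m]))` for one instance.

    idle_rates holds one idle fraction (0.0 to 1.0) per CPU core.
    """
    return 1.0 - mean(idle_rates)


def alert_states(usages, eval_interval_s=60, for_s=300, threshold=0.8):
    """Return the alert state after each rule evaluation.

    Simplified model (an assumption, not the Prometheus implementation):
    - 'pending' from the first evaluation where usage > threshold,
    - 'firing' once the condition has held for at least for_s seconds,
    - 'inactive' as soon as the condition is false again.
    """
    states = []
    active_since = None  # evaluation time at which the condition became true
    for i, usage in enumerate(usages):
        now = i * eval_interval_s
        if usage > threshold:
            if active_since is None:
                active_since = now
            held_long_enough = now - active_since >= for_s
            states.append("firing" if held_long_enough else "pending")
        else:
            active_since = None  # any dip below the threshold resets the timer
            states.append("inactive")
    return states
```

In this model, with evaluation_interval: 1m and for: 5m, a single evaluation below the threshold resets the timer, which is why short CPU spikes do not fire the alert.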
    

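Step 2: Set Up Alertmanager

Prometheus evaluates alerting rules on its own, but it does not send notifications. Alertmanager receives the alerts that Prometheus fires and takes care of grouping them, routing them, and delivering them to receivers.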

  1. Install Alertmanager: Download Alertmanager from the Prometheus download page and extract it, replacing <version> with the release you want to install:

     wget https://github.com/prometheus/alertmanager/releases/latest/download/alertmanager-<version>.linux-amd64.tar.gz
     tar xvf alertmanager-<version>.linux-amd64.tar.gz
     cd alertmanager-<version>.linux-amd64
    
  2. Configure Alertmanager: Create a configuration file named alertmanager.yml with the following content:

     global:
       resolve_timeout: 5m
    
     route:
       group_by: ['alertname']
       group_wait: 30s
       group_interval: 5m
       repeat_interval: 3h
       receiver: 'default'
    
     receivers:
     - name: 'default'
       webhook_configs:
       - url: 'http://localhost:5000'  # Replace with your webhook endpoint (email uses email_configs instead)
    
  3. Start Alertmanager: Run Alertmanager with the following command:

     ./alertmanager --config.file=alertmanager.yml
    
  4. Modify Prometheus Configuration to Use Alertmanager: Add an alerting section to your prometheus.yml so that Prometheus knows where to send alerts:

     alerting:
       alertmanagers:
       - static_configs:
         - targets:
           - localhost:9093  # Default Alertmanager port
    
  5. Restart Prometheus Again: Restart Prometheus so that it picks up the new alerting section.
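
The webhook receiver configured above needs something listening on http://localhost:5000. The following is a minimal sketch of such a listener using only the Python standard library. The names summarize_alerts, AlertWebhookHandler, and make_server are hypothetical, and the fields read from the request body (alerts, status, labels, annotations) assume the JSON layout that Alertmanager sends to webhook receivers.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alerts(payload: dict) -> list[str]:
    """Turn a webhook payload into one human-readable line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            f"[{alert.get('status', 'unknown')}] "
            f"{labels.get('alertname', '?')} "
            f"severity={labels.get('severity', 'n/a')}: "
            f"{annotations.get('summary', '')}"
        )
    return lines


class AlertWebhookHandler(BaseHTTPRequestHandler):
    """Accept POST requests from Alertmanager and print each alert."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            self.send_response(400)  # reject bodies that are not a JSON object
            self.end_headers()
            return
        for line in summarize_alerts(payload):
            print(line)
        self.send_response(200)  # a 2xx status tells the sender delivery succeeded
        self.end_headers()


def make_server(port: int = 5000) -> HTTPServer:
    """Create (but do not start) the listener on the port used in alertmanager.yml."""
    return HTTPServer(("localhost", port), AlertWebhookHandler)
```

Calling make_server().serve_forever() starts the listener. A real receiver would forward the alert to a chat or paging system instead of printing it.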

Step 3: Display Alerts in Grafana

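  1. Access Grafana: Open your web browser and go to http://localhost:3000, Grafana's default address. If the application scraped in Step 1 also listens on port 3000, run one of the two on a different port.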

  2. Add the Prometheus Data Source: Go to Configuration (gear icon) > Data Sources > Add data source and select Prometheus.

    Make sure the HTTP URL is set to your Prometheus server (e.g., http://localhost:9090) and click Save & Test.

  3. Create a New Dashboard for Alerts:

    • Click on the Plus icon (➕) in the left sidebar and select Dashboard.

    • Click on Add new panel.

    • Use the following query to fetch alert information:

     ALERTS{alertstate="firing"}

  4. Choose Visualization:

    • Change the visualization type (e.g., table, graph) depending on how you want to display alerts.

    • Configure the panel settings to show label columns such as alertname, severity, and instance. The ALERTS series carries only labels, so annotations such as summary and description are not available from this query.

  5. Save the Dashboard: Click the disk icon to save your dashboard.
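
To check what the dashboard query returns without going through Grafana, the same expression can be sent to the Prometheus HTTP API. The sketch below assumes Prometheus is reachable at http://localhost:9090 and uses the /api/v1/query instant-query endpoint. The helper names build_query_url, extract_alerts, and fetch_firing_alerts are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def extract_alerts(response: dict) -> list[dict]:
    """Return the label set of every series in an instant-query response."""
    if response.get("status") != "success":
        raise ValueError("query failed: " + str(response.get("error")))
    return [series["metric"] for series in response["data"]["result"]]


def fetch_firing_alerts(base_url: str = "http://localhost:9090") -> list[dict]:
    """Query Prometheus for currently firing alerts (needs a running server)."""
    url = build_query_url(base_url, 'ALERTS{alertstate="firing"}')
    with urllib.request.urlopen(url, timeout=5) as resp:
        return extract_alerts(json.load(resp))
```

Each returned dictionary contains labels such as alertname, alertstate, and severity, which are the same fields available as columns in a Grafana table panel.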

Step 4: Test Alerts

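To test the alert, generate sustained CPU load on the monitored host, for example by stressing your Node.js application or by running a load testing tool. The node_cpu_seconds_total metric is exported by node_exporter and covers the whole host, and the rule averages it over all cores of an instance, so the load must keep overall CPU usage above 80%. Once usage has stayed above the threshold for the full 5-minute for duration, the alert moves from pending to firing and appears in your Grafana dashboard.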
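
If no load testing tool is at hand, a small script can generate the load. The following is a rough sketch that runs one busy loop per CPU core for a given number of seconds. The file name burn_cpu.py and the function names are hypothetical, and it should only be run on a test machine.

```python
import multiprocessing
import sys
import time


def burn_cpu(duration_s: float) -> int:
    """Busy-loop for duration_s seconds and return the number of iterations."""
    deadline = time.monotonic() + duration_s
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def burn_all_cores(duration_s: float) -> None:
    """Start one busy-loop process per CPU core and wait for all of them."""
    workers = [
        multiprocessing.Process(target=burn_cpu, args=(duration_s,))
        for _ in range(multiprocessing.cpu_count())
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


if __name__ == "__main__":
    # Only generate load when a duration is passed explicitly.
    if len(sys.argv) == 2:
        burn_all_cores(float(sys.argv[1]))
    else:
        print("usage: python burn_cpu.py <seconds>")
```

Running it for noticeably longer than the 5-minute rate window plus the 5-minute for duration, for example python burn_cpu.py 900, leaves enough time for the alert to go from pending to firing.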

Conclusion

You have now configured Prometheus to trigger alerts based on specific metric thresholds, routed them through Alertmanager, and displayed them in Grafana.