Introduction to the Problem
Your IT systems might run smoothly every day, but what happens when an incident strikes? A server abruptly stops, a critical service fails, or system resources hit dangerous levels. Without a reliable alert system, you might only discover issues when users start complaining. By then, the damage may already be done.
Monitoring is the first and essential step to maintaining a stable system. However, monitoring alone is not enough. You need an effective alert system that provides instant notifications, anytime, anywhere. This article guides you on how to configure Alertmanager – a core component of Prometheus – to send alerts via SMS and Telegram. These are two fast and reliable notification channels, highly suitable for administrators.
When I first started setting up monitoring, I also experienced ‘alert fatigue’ – being overwhelmed by countless unimportant alerts. This almost made me miss truly critical incidents. I spent a lot of time adjusting thresholds and configuring Alertmanager. The goal was to create a balanced alert system that only notifies when truly necessary. Therefore, in addition to the basic configuration guide, I will also share experiences to help you avoid ‘alert fatigue’ from the start.
Core Concepts
To set up effective alerts, you first need to understand the main components involved in this process.
Prometheus: The Heart of the Monitoring System
Although this article focuses on Alertmanager, we cannot overlook Prometheus. It is an open-source monitoring and alerting system, specializing in collecting metrics from configured targets. When a metric exceeds a threshold, Prometheus generates an alert. However, it does not send them directly but forwards them to Alertmanager.
Alertmanager: The Brain of Alert Processing
Alertmanager is an independent component. It receives alerts from Prometheus (or other sources), then processes them based on pre-defined rules. The main functions of Alertmanager include:
- Grouping: Groups multiple similar alerts into a single notification. For example, if 10 servers simultaneously report disk space errors, Alertmanager will send one common notification instead of 10 individual messages.
- Inhibition: Suppresses dependent alerts when a primary alert is triggered. For instance, if the main server reports a loss of connection, Alertmanager will automatically prevent alerts about services running on that server.
- Silencing: Allows temporarily disabling alerts for a specific period (e.g., during system maintenance).
- Routing: Sends alerts to different receivers, based on alert labels. Receivers can include email, Slack, PagerDuty, webhooks. This article will focus on Telegram and SMS.
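The functions above come together in the routing tree of alertmanager.yaml. As a minimal sketch (the receiver names here are placeholders for illustration, not part of any standard configuration):

```yaml
route:
  receiver: 'default'            # fallback when no child route matches
  group_by: ['alertname']        # grouping happens per route
  routes:
    - match:
        severity: 'critical'
      receiver: 'oncall-sms'     # hypothetical SMS receiver
    - match:
        team: 'database'
      receiver: 'db-telegram'    # hypothetical Telegram receiver
```

Alertmanager walks this tree top-down and, by default, delivers each alert to the first child route that matches its labels.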
Why SMS and Telegram?
In a modern system environment, receiving timely alerts is a crucial factor.
- Telegram: As a popular messaging app, Telegram provides a flexible bot API, making integration for sending notifications easy. Key advantages: completely free, can send more information than email, and notifications are almost instant. This is a very convenient tool for technical teams.
- SMS: Although seemingly ‘classic’, SMS remains an extremely reliable notification channel. Especially in emergency situations, when there is no internet connection or messaging applications encounter issues, SMS still works. It is available on every mobile phone and does not require a special application. SMS is a critical fallback channel for the most severe alerts.
Detailed Practice: Configuring Alerts with Alertmanager
To begin, I will assume you already have Prometheus and Alertmanager running. If not, you can refer to the Prometheus and Grafana installation guides on itfromzero.com, or use Docker for a quick setup.
1. Telegram Notification Configuration
Telegram is a popular alert channel due to its flexibility and being completely free.
Step 1: Create a Telegram Bot and get the Bot Token
- Open the Telegram app and search for `@BotFather`.
- Start a conversation with `@BotFather` and type `/newbot`.
- Follow the instructions to name your bot (e.g., `itfromzero_alert_bot`) and choose a unique username (e.g., `itfromzero_alert_bot`).
- Once completed, `@BotFather` will provide an HTTP API Token. Store this token carefully; it looks like `123456789:ABCDEFGH-IJKLMN_OPQRSTUVXYZ`.
Step 2: Get the Chat ID of a group or user
Alertmanager needs to know where to send messages. This can be an individual or a group.
- Search for your bot on Telegram and start a conversation with it, or add the bot to a group chat where you want to receive alerts.
- Send any message to the bot or in the group chat containing the bot.
- Open your browser and access the following URL (replace `<BOT_TOKEN>` with your token): `https://api.telegram.org/bot<BOT_TOKEN>/getUpdates`
- You will receive a JSON response. Look for the `chat` object and its `id` field; this is the `chat_id` you need. If it's a group, the ID will be a negative number (e.g., `-123456789`).
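Rather than hunting through the raw JSON by eye, a short script can pull the chat IDs out of the getUpdates response. This sketch parses a saved response; the structure follows the documented Bot API shape, but the values here are made up:

```python
import json

# Example getUpdates response, truncated to the fields we care about.
sample = json.loads("""
{
  "ok": true,
  "result": [
    {"update_id": 1, "message": {"message_id": 10,
      "chat": {"id": -123456789, "type": "group", "title": "Alerts"}}},
    {"update_id": 2, "message": {"message_id": 11,
      "chat": {"id": 987654321, "type": "private", "first_name": "Ops"}}}
  ]
}
""")

def extract_chat_ids(response: dict) -> set[int]:
    """Collect every chat.id seen in a getUpdates response."""
    return {
        upd["message"]["chat"]["id"]
        for upd in response.get("result", [])
        if "message" in upd
    }

print(extract_chat_ids(sample))  # group IDs are negative
```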
Step 3: Edit Alertmanager Configuration (alertmanager.yaml)
Add or edit the receivers and routes sections in your alertmanager.yaml file.
# alertmanager.yaml
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'default-receiver' # Default receiver
  routes:
    - match:
        severity: 'critical' # Alerts with 'critical' severity
      receiver: 'telegram-critical'
    - match:
        severity: 'warning' # Alerts with 'warning' severity
      receiver: 'telegram-warning'
    # You can add other rules here

receivers:
  - name: 'default-receiver'
    # Configure a default receiver here, or leave it empty if you don't want default alerts.
    # telegram_configs:
    #   - chat_id: <CHAT_ID_DEFAULT>
    #     parse_mode: 'HTML'
  - name: 'telegram-critical'
    telegram_configs:
      - bot_token: '<BOT_TOKEN_CUA_BAN>'
        chat_id: <CHAT_ID_NHOM_CRITICAL> # Numeric chat ID of the group/user receiving critical alerts
        parse_mode: 'HTML'
        send_resolved: true # Also notify when the alert is resolved
  - name: 'telegram-warning'
    telegram_configs:
      - bot_token: '<BOT_TOKEN_CUA_BAN>'
        chat_id: <CHAT_ID_NHOM_WARNING> # Numeric chat ID of the group/user receiving warning alerts
        parse_mode: 'HTML'
        send_resolved: true
Note: I use two different chat_ids for critical and warning to illustrate routing. You can of course use the same chat_id for every alert. Replace <BOT_TOKEN_CUA_BAN>, <CHAT_ID_NHOM_CRITICAL>, and <CHAT_ID_NHOM_WARNING> with your actual values, and keep in mind that Alertmanager expects chat_id to be a number, not a quoted string.
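By default, Alertmanager renders Telegram messages from its built-in template. If you want shorter, custom messages, telegram_configs also accepts a message field with Go templating. A hedged example (the template text below is my own, not a standard one):

```yaml
- name: 'telegram-critical'
  telegram_configs:
    - bot_token: '<BOT_TOKEN_CUA_BAN>'
      chat_id: <CHAT_ID_NHOM_CRITICAL>
      parse_mode: 'HTML'
      send_resolved: true
      message: >-
        {{ range .Alerts }}<b>{{ .Labels.alertname }}</b>
        on {{ .Labels.instance }}: {{ .Annotations.summary }}
        {{ end }}
```

Because parse_mode is HTML here, any HTML tags in the template must be ones Telegram supports (such as &lt;b&gt; and &lt;i&gt;).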
Step 4: Reload Alertmanager Configuration
After changing the alertmanager.yaml file, you need to reload the configuration for Alertmanager to apply the changes.
# If you are running Alertmanager as a service
sudo systemctl reload alertmanager
# Or via API if enabled (recommended)
curl -XPOST http://localhost:9093/-/reload
Step 5: Test Telegram Alerts
To test, you need to create a dummy alert in Prometheus or trigger a real alert situation.
Example Prometheus configuration to create test alerts:
Add to prometheus.yml (or your rule file):
# prometheus.yml
# ...
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093'] # Your Alertmanager address

rule_files:
  - "alert_rules.yml" # Ensure Prometheus reads this file
# ...
alert_rules.yml file:
# alert_rules.yml
groups:
  - name: general.rules
    rules:
      - alert: HighLoadTest
        expr: node_load1 > 0.01 # Deliberately low threshold so the alert fires easily
        for: 1s
        labels:
          severity: 'critical' # Matches the telegram-critical route
        annotations:
          summary: "Server {{ $labels.instance }} is experiencing high load (test)"
          description: "Average load over 1 minute is {{ $value }} on {{ $labels.instance }}."
      - alert: LowDiskSpaceTest
        expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} < 0.9 # Less than 90% free space (easy to trigger)
        for: 1s
        labels:
          severity: 'warning' # Matches the telegram-warning route
        annotations:
          summary: "Disk {{ $labels.mountpoint }} on {{ $labels.instance }} is almost full (test)"
          description: "Only {{ $value | humanizePercentage }} free space remaining on {{ $labels.instance }}."
Reload the Prometheus configuration (this endpoint requires Prometheus to be started with --web.enable-lifecycle; otherwise, restart the service):
curl -XPOST http://localhost:9090/-/reload
If everything is configured correctly, you will receive Telegram messages as soon as these alert conditions are triggered.
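If you prefer not to touch the Prometheus rules at all, you can also push a synthetic alert straight to Alertmanager's HTTP API (POST /api/v2/alerts). The sketch below only builds and prints the payload; uncomment the urlopen call to actually send it (the URL assumes Alertmanager on localhost:9093):

```python
import json
from datetime import datetime, timezone
# from urllib.request import Request, urlopen  # uncomment to actually send

def build_test_alert(alertname: str, severity: str, instance: str) -> list[dict]:
    """Build a payload in the shape expected by POST /api/v2/alerts."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [{
        "labels": {"alertname": alertname, "severity": severity, "instance": instance},
        "annotations": {"summary": f"Synthetic {severity} alert for testing"},
        "startsAt": now,
    }]

payload = build_test_alert("ManualTest", "critical", "server01:9100")
body = json.dumps(payload).encode()
print(body.decode())
# req = Request("http://localhost:9093/api/v2/alerts", data=body,
#               headers={"Content-Type": "application/json"}, method="POST")
# urlopen(req)
```

Because severity is 'critical', such an alert should flow through the telegram-critical route defined above.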
2. SMS Notification Configuration (Via Webhook and custom script)
Alertmanager does not have direct SMS integration. Therefore, we will use a webhook combined with a custom script to call an SMS Gateway service. This method offers high flexibility and customization.
Step 1: Understand Webhook and SMS Gateway Mechanisms
- Webhook: Alertmanager will send an HTTP POST request to a URL you specify when an alert occurs. This request contains detailed alert information in JSON format.
- Custom Script/Service: You will run a small application (e.g., written in Python with Flask). This application will listen for POST requests from Alertmanager. Upon receiving a request, the script will parse the JSON, extract the necessary information, and call the API of an SMS provider (like Twilio, Nexmo, or a local SMS service).
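For reference, the JSON payload Alertmanager POSTs to a webhook has roughly this shape (the values here are illustrative; see the Alertmanager webhook receiver documentation for the full field list):

```json
{
  "version": "4",
  "status": "firing",
  "receiver": "sms-critical",
  "groupLabels": {"alertname": "HighLoadTest"},
  "commonLabels": {"severity": "critical"},
  "externalURL": "http://localhost:9093",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "HighLoadTest", "severity": "critical",
                 "instance": "server01:9100"},
      "annotations": {"summary": "Server server01:9100 is experiencing high load"},
      "startsAt": "2024-01-01T00:00:00Z",
      "endsAt": "0001-01-01T00:00:00Z"
    }
  ]
}
```

The script in the next step reads the alerts array from exactly this structure.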
Step 2: Prepare an SMS Gateway Service (Conceptual)
For simplicity, I will present a basic Python script. In reality, you will need to integrate it with an SMS provider’s API. This script only illustrates how to receive webhooks and process information.
Suppose you have an sms_gateway.py file like this:
# sms_gateway.py (illustrative script; adapt it to your SMS provider's API)
from flask import Flask, request, jsonify
import os

app = Flask(__name__)

# Replace with the phone number that should receive the SMS
TARGET_PHONE_NUMBER = os.environ.get("TARGET_PHONE_NUMBER", "+849xxxxxxxx")

@app.route('/sms-alert', methods=['POST'])
def send_sms_alert():
    try:
        alert_data = request.get_json()
        # Process the alert information from Alertmanager.
        # Customize how you want the SMS content to be displayed.
        for alert in alert_data.get('alerts', []):
            alertname = alert['labels'].get('alertname', 'Unknown Alert')
            severity = alert['labels'].get('severity', 'info')
            instance = alert['labels'].get('instance', 'Unknown Instance')
            summary = alert['annotations'].get('summary', 'No Summary')

            # Build the SMS message content
            sms_message = f"[ITFZS] {severity.upper()} - {alertname} on {instance}: {summary}"
            print(f"Sending SMS to {TARGET_PHONE_NUMBER}: {sms_message}")

            # --- Call your SMS provider's API here ---
            # Example with a hypothetical API:
            # import requests
            # sms_api_url = "https://api.sms_provider.com/send"
            # payload = {
            #     "to": TARGET_PHONE_NUMBER,
            #     "message": sms_message,
            #     "api_key": os.environ.get("SMS_API_KEY"),
            # }
            # response = requests.post(sms_api_url, json=payload)
            # if response.status_code == 200:
            #     print("SMS sent successfully!")
            # else:
            #     print(f"Failed to send SMS: {response.status_code} - {response.text}")
            # ---

        # In this example, we just print to the console
        print("SMS alert processed (conceptual).")
        return jsonify({"status": "success", "message": "SMS alert processed conceptually."}), 200
    except Exception as e:
        print(f"Error processing SMS alert: {e}")
        return jsonify({"status": "error", "message": str(e)}), 500

if __name__ == '__main__':
    # Run on port 9099; make sure Alertmanager can reach it
    print("Starting SMS Gateway Mockup on port 9099...")
    app.run(host='0.0.0.0', port=9099)
To run this script, you need to install Flask (pip install Flask) and run it:
export TARGET_PHONE_NUMBER="+849xxxxxxxx" # Replace with your actual phone number
python sms_gateway.py
Ensure this script runs continuously and is accessible from Alertmanager (on the same server, or over the network if on a different server).
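One way to keep the script running is a small systemd unit. This is only a sketch; the paths, user, and environment values below are assumptions you will need to adapt:

```ini
# /etc/systemd/system/sms-gateway.service (hypothetical paths and user)
[Unit]
Description=SMS gateway webhook for Alertmanager
After=network.target

[Service]
User=alertmanager
Environment=TARGET_PHONE_NUMBER=+849xxxxxxxx
ExecStart=/usr/bin/python3 /opt/sms-gateway/sms_gateway.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `sudo systemctl enable --now sms-gateway` so the gateway survives reboots.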
Step 3: Edit Alertmanager Configuration (alertmanager.yaml) for SMS
Add a new receiver using webhook_configs to point to your custom SMS Gateway script.
# alertmanager.yaml (same Alertmanager configuration file as above)
# ...
route:
  # ... (existing route settings)
  routes:
    - match:
        severity: 'critical'
      receiver: 'sms-critical' # Route critical alerts to SMS
    - match:
        severity: 'emergency' # Example: an extra, extremely urgent level
      receiver: 'sms-critical' # Emergency alerts also go out via SMS
    # ... (other routes)

receivers:
  # ... (existing Telegram receivers)
  - name: 'sms-critical'
    webhook_configs:
      - url: 'http://localhost:9099/sms-alert' # URL of the custom SMS Gateway script
        send_resolved: true
        # Configure http_config if your script requires authentication
        # http_config:
        #   basic_auth:
        #     username: 'smsuser'
        #     password: 'smspassword'
Replace http://localhost:9099/sms-alert with the actual address where your sms_gateway.py script is running.
Step 4: Reload Alertmanager Configuration and Test
Reload the Alertmanager configuration as you did for Telegram.
curl -XPOST http://localhost:9093/-/reload
You can update alert_rules.yml in Prometheus. Create an alert with severity: 'critical' (or emergency) to test. Check if the sms_gateway.py script receives the webhook and prints the notification. If you see the line Sending SMS to ... in the sms_gateway.py console, it means Alertmanager successfully sent the webhook. The next step is to integrate this script with a real SMS provider’s API.
3. Optimizing Alerts and Avoiding “Alert Fatigue”
As shared, ‘alert fatigue’ is a big problem. Alertmanager provides many useful features to manage the alert flow.
- Grouping: Use group_by, group_wait, group_interval, and repeat_interval in the route section. This keeps you from being 'spammed' when many similar alerts fire at once.

route:
  group_by: ['alertname', 'instance', 'severity'] # Group by alert name, instance, and severity level
  group_wait: 30s # Wait 30 seconds to collect more alerts before sending
  group_interval: 5m # If new alerts appear in the group, wait another 5 minutes before re-sending
  repeat_interval: 4h # Repeat alerts every 4 hours if not yet resolved
  receiver: 'default-receiver'

With this configuration, Alertmanager groups alerts related to the same issue on the same instance, so the number of notifications you receive drops significantly.

- Inhibition: Suppress 'secondary' alerts when a 'primary' alert has already triggered. For example, if an entire server is down, you only want to be notified about that server, not receive dozens of alerts about the services running on it.

# alertmanager.yaml
inhibit_rules:
  - source_match:
      severity: 'critical' # The server-down alert has 'critical' severity
    target_match:
      severity: 'warning' # Service alerts have 'warning' severity
    equal: ['instance'] # Only applies when both occur on the same instance
# This rule says: if a 'critical' alert is firing on a specific 'instance',
# suppress all 'warning' alerts on that same 'instance'.

- Silences: When you know about a maintenance window or a temporary issue in advance, create a silence via Alertmanager's web interface (usually http://localhost:9093). This temporarily stops matching alerts for a specified period, which is very useful while you are actively troubleshooting or upgrading the system.
By intelligently utilizing these features, you can build an effective alert system. This system will focus only on core information, helping you avoid ‘alert fatigue’ and react faster to any incidents.
Conclusion
Setting up an effective monitoring and alerting system is vital for any stable IT system. In this article, you learned how to configure Alertmanager to send alerts to Telegram and SMS. These are two powerful and reliable notification channels.
Telegram integration helps your team receive fast, free, and comprehensive notifications. Meanwhile, SMS acts as the final alert layer. It ensures that even in the most extreme situations, you remain informed about incidents.
Start configuring today to improve your response capabilities and maintain the stability of your system. A properly configured alert system not only gives you peace of mind but also demonstrates a professional and reliable IT infrastructure.

