My Alertmanager configuration is:
config:
  global:
    resolve_timeout: 5m
  route:
    group_by: ['alert_manager_group_by']
    group_wait: 30s
    group_interval: 15m
    repeat_interval: 30m
    receiver: 'alerting-metrics'
    routes:
      - receiver: 'alerting-metrics'
        matchers:
          - group =~ "TestGroup"
        continue: true
      - receiver: 'teams'
        matchers:
          - group =~ "TestGroup"
        continue: true
      - receiver: 'null'
        matchers:
          - alertname =~ "InfoInhibitor|Watchdog"
  receivers:
    - name: 'null'
    - name: 'alerting-metrics'
      webhook_configs:
        - send_resolved: false
          url: 'my webhook'
    - name: 'teams'
      webhook_configs:
        - send_resolved: false
          url: http://prometheus-msteams:2000/high_priority_channel
I have a receiver that the alerts are delivered to, and it keeps a Prometheus counter so I can get statistics on how many alerts have fired.
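For context, the counting on the receiver side works roughly like the sketch below. This is not my actual receiver: `record_notification` and the in-memory `Counter` are hypothetical stand-ins for the real Prometheus counter, and the sample payload only includes the fields of the Alertmanager webhook body that the sketch reads (`alerts`, `status`, `labels`).

```python
import json
from collections import Counter


def record_notification(body: bytes, counts: Counter) -> int:
    """Count every alert contained in one webhook notification body.

    `counts` is keyed by (alertname, status); it stands in for a real
    Prometheus counter with the same two labels.
    """
    payload = json.loads(body)
    alerts = payload.get("alerts", [])
    for alert in alerts:
        name = alert.get("labels", {}).get("alertname", "unknown")
        status = alert.get("status", "unknown")  # "firing" or "resolved"
        counts[(name, status)] += 1
    return len(alerts)


# Illustrative payload; label values are taken from the logs in this post.
sample = json.dumps({
    "status": "firing",
    "receiver": "alerting-metrics",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "Prometheus_P01_Alert",
                "group": "TestGroup",
                "alert_manager_group_by": "Front_Door",
            },
        }
    ],
}).encode()

counts = Counter()
record_notification(sample, counts)
print(dict(counts))
```

Because every notification increments the counter, any extra notification from Alertmanager shows up directly in the statistics.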
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 3:45:13.104 PM","ts=2024-03-18T15:45:13.104Z caller=notify.go:743 level=debug component=dispatcher receiver=alerting-metrics integration=webhook[0] msg=""Notify success"" attempts=1" 45 minutes after last notification as no new alert was added to the group (group_wait + repeat_interval)
About 30 minutes after the last notification, the alert arrives again with status resolved:
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 4:15:12.988 PM","ts=2024-03-18T16:15:12.988Z caller=dispatch.go:163 level=debug component=dispatcher msg=""Received alert"" alert=Prometheus_P01_Alert[6fdc8ff][resolved]"
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 4:15:13.063 PM","ts=2024-03-18T16:15:13.063Z caller=dispatch.go:515 level=debug component=dispatcher aggrGroup=""{}/{group=~\""TestGroup""}:{alert_manager_group_by=\""Front_Door\""}"" msg=flushing alerts=[Prometheus_P01_Alert[6fdc8ff][resolved]]"
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 4:15:13.063 PM","ts=2024-03-18T16:15:13.063Z caller=dispatch.go:515 level=debug component=dispatcher aggrGroup=""{}/{group=~\""TestGroup""}:{alert_manager_group_by=\""Front_Door\""}"" msg=flushing alerts=[Prometheus_P01_Alert[6fdc8ff][resolved]]"
Then the alert becomes active again:
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 4:16:42.989 PM","ts=2024-03-18T16:16:42.989Z caller=dispatch.go:163 level=debug component=dispatcher msg=""Received alert"" alert=Prometheus_P01_Alert[6fdc8ff][active]"
30 seconds later, the notification is sent:
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 4:17:12.990 PM","ts=2024-03-18T16:17:12.990Z caller=dispatch.go:515 level=debug component=dispatcher aggrGroup=""{}/{group=~\""TestGroup""}:{alert_manager_group_by=\""Front_Door\""}"" msg=flushing alerts=[Prometheus_P01_Alert[6fdc8ff][active]]"
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 4:17:12.990 PM","ts=2024-03-18T16:17:12.990Z caller=dispatch.go:515 level=debug component=dispatcher aggrGroup=""{}/{group=~\""TestGroup""}:{alert_manager_group_by=\""Front_Door\""}"" msg=flushing alerts=[Prometheus_P01_Alert[6fdc8ff][active]]"
"alertmanager-monitoring-emea-dev-kube-p-alertmanager-0","3/18/2024, 4:17:13.016 PM","ts=2024-03-18T16:17:13.016Z caller=notify.go:743 level=debug component=dispatcher receiver=alerting-metrics integration=webhook[0] msg=""Notify success"" attempts=1"
The notification was sent regardless of when the last notification was sent.
Question: Is this normal behavior? Does Alertmanager reset repeat_interval when an alert resolves and then fires again?
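For reference, the gaps between the log timestamps above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the timestamps are copied from the `ts=` fields in the logs, and the helper name `seconds_between` is my own.

```python
from datetime import datetime

# Format of the ts= field in the Alertmanager debug logs above.
FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


previous_notify = "2024-03-18T15:45:13.104Z"    # "Notify success"
resolved_received = "2024-03-18T16:15:12.988Z"  # alert received as resolved
active_received = "2024-03-18T16:16:42.989Z"    # alert received as active
next_notify = "2024-03-18T16:17:13.016Z"        # "Notify success"

print("previous notify -> resolved received:",
      seconds_between(previous_notify, resolved_received))
print("active received -> next notify:",
      seconds_between(active_received, next_notify))
print("previous notify -> next notify:",
      seconds_between(previous_notify, next_notify))
```

With these timestamps, the second notification goes out about 30 seconds after the alert is received as active again (matching group_wait: 30s) and about 32 minutes after the previous notification.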