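I was looking at this question/answer: Prometheus AlertManager - Send Alerts to different clients based on routes
It was a very good start for me, and I wish I could ask the answerer a quick question there, but I don't have the reputation.
Anyway, I have an alert.rules.yml file with two groups, which looks like: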
groups:
- name: DevOpsAlerts
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} down"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes. ({{ $value }} minutes)"
  - alert: InstanceHighCpu
    expr: 100 - (avg by (host) (irate(node_cpu{mode="idle"}[5m])) * 100) > 5
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.host }}: CPU High"
      description: "{{ $labels.host }} has high CPU activity"
- name: TestTeam2
  rules:
  - alert: InstanceLowMemory
    expr: node_memory_MemAvailable < 268435456
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.host }}: memory low"
      description: "{{ $labels.host }} has less than 256M memory available"
  - alert: InstanceLowDisk
    expr: node_filesystem_avail{mountpoint="/"} < 1073741824
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.host }}: low disk space"
      description: "{{ $labels.host }} has less than 1G FS space"
In addition to that, I have an alertmanager.yml file that looks like:
global:
  smtp_smarthost: 'smtpserver'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'verystrongpassword'
  smtp_require_tls: true  # must be a boolean; true is the Alertmanager default
route:
  group_by: ['alertname', 'cluster', 'service']
  # default receiver
  receiver: DevOps
  routes:
  - match:
      alertname: InstanceDown
    receiver: DevOps
  - match:
      alertname: InstanceHighCpu
    receiver: test-team-1
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'cluster', 'service']
receivers:
- name: DevOps
  email_configs:
  # - to: [email protected]
- name: test-team-1
  email_configs:
  - to: [email protected] # This can be any email specified from the team
- name: team-name-2
  email_configs:
  - to: [email protected] # This can be any email specified from the team
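So, from what I gather, I can route an alert to a specific receiver by matching on the alert name from the alert rules file in a route.
The big question I'm really facing is: is there a way to route alerts to a specific receiver based on the group name, rather than the alert name, from the alert rules file?
Instead of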
routes:
- match:
    alertname: InstanceDown
  receiver: DevOps
is there some way to implement:
routes:
- match:
    group: DevOpsAlerts
  receiver: DevOps
I've been searching online for an example of something like this, but I couldn't find anything. Thanks.
Rule group names are not exposed to Alertmanager; they exist mainly for debugging on the Prometheus side.
What you can do is add a group: DevOpsAlerts label to each alert.
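As a minimal sketch of that suggestion, assuming you reuse the rule group names as label values: the label name group is an arbitrary choice (any label name not already in use would work), and the excerpt below shows only one rule and the routing part of the config, not complete files.

```yaml
# alert.rules.yml (Prometheus side): attach a label that carries the group name.
# "group" is a label name chosen for this sketch, not something Prometheus sets itself.
groups:
- name: DevOpsAlerts
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 5m
    labels:
      severity: critical
      group: DevOpsAlerts   # repeat this on every rule in the group
---
# alertmanager.yml (Alertmanager side): route on that label instead of alertname.
route:
  receiver: DevOps          # default receiver
  routes:
  - match:
      group: DevOpsAlerts
    receiver: DevOps
  - match:
      group: TestTeam2      # assumes the TestTeam2 rules carry group: TestTeam2
    receiver: team-name-2
```

The label has to be repeated on each rule because Alertmanager only sees the labels attached to an alert. Newer Alertmanager releases also accept a matchers list in place of match, so check which form your version expects.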