I am using Prometheus with the Blackbox exporter's ICMP prober to monitor the UP/DOWN status of N systems.
blackbox-exporter configuration:
modules:
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
Prometheus configuration:
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['prometheus:9090']
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['192.168.1.29', '987.234.121.1']
        labels:
          group: 'Build'
      - targets: ['161.92.248.21', '161.92.3.185', '10.10.4.18']
        labels:
          group: 'RND'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackboxexporter:9115
The blackbox-exporter probe results are accurate. Note: the unreachable targets are reported as failures, which is what I expect.
Recent Probes
Module  Target         Result   Debug
icmp    192.168.1.29   Failure  Logs
icmp    192.168.3.185  Failure  Logs
icmp    161.92.248.21  Success  Logs
icmp    192.168.4.185  Failure  Logs
icmp    987.234.121.1  Failure  Logs
icmp    192.168.1.29   Failure  Logs
icmp    192.168.3.185  Failure  Logs
icmp    161.92.248.21  Success  Logs
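The Prometheus results are not accurate: the Targets page shows every target as UP. Note: the expected result is that the failing targets are shown as down (0/1).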
blackbox (5/5 up)
Endpoint                                                                 State  Labels                                                 Last Scrape  Scrape Duration  Error
http://blackboxexporter:9115/probe module="icmp" target="161.92.248.21"  UP     group="RND" instance="161.92.248.21" job="blackbox"    1.43s ago    1.522ms
http://blackboxexporter:9115/probe module="icmp" target="192.168.1.29"   UP     group="Build" instance="192.168.1.29" job="blackbox"   5.548s ago   1.501s
http://blackboxexporter:9115/probe module="icmp" target="192.168.3.185"  UP     group="RND" instance="192.168.3.185" job="blackbox"    1.944s ago   1.501s
http://blackboxexporter:9115/probe module="icmp" target="192.168.4.185"  UP     group="RND" instance="192.168.4.185" job="blackbox"    3.09s ago    1.501s
http://blackboxexporter:9115/probe module="icmp" target="987.234.121.1"  UP     group="Build" instance="987.234.121.1" job="blackbox"  2.796s ago   1.506ms
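I believe querying the probe_success metric in Prometheus will give you the expected result. The UP state on the Targets page only tells you that Prometheus was able to scrape the exporter's /probe endpoint, not whether the probe itself succeeded.
When using blackbox-exporter, we usually build basic up/down alerts on this metric.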
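As a rough illustration of such an alert, here is a minimal sketch of a Prometheus rule file built on probe_success. The group name, alert name, `for` duration, severity label, and summary text are all placeholders I made up for the example; only the metric name and the job="blackbox" label come from the configuration above. The file would also need to be listed under rule_files in the Prometheus configuration.

```yaml
groups:
  - name: blackbox-probes          # hypothetical group name
    rules:
      - alert: ProbeFailed         # hypothetical alert name
        # probe_success is 1 when the probe succeeded and 0 when it failed
        expr: probe_success{job="blackbox"} == 0
        for: 1m                    # assumed grace period; tune to taste
        labels:
          severity: critical       # assumed label, not from the original setup
        annotations:
          summary: "ICMP probe failed for {{ $labels.instance }}"
```

Running the same expression, probe_success{job="blackbox"}, in the Prometheus expression browser should show one series per instance with a value of 0 or 1, which is the up/down view the question is after.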