I am using the Nagios check_logs.pl script to check a log file for errors from Puppet runs. The errors are logged to /var/log/puppet/error.log. The configuration file is
/usr/local/nagios/custom/check_puppet.cfg
and its contents are:
[root@prod nagios] cat /usr/local/nagios/custom/check_puppet.cfg
$seek_file_template='/var/log/puppet/$log_file.puppet-agent.check_log.seek';
@log_files =
(
{
'file_name' => '/var/log/puppet/error.log',
'reg_exp' => '(Could not send report:|Could not retrieve file metadata for |Could not retrieve catalog from remote server|Could not retrieve catalog; skipping run|Error 500 on SERVER:)',
},
);
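The problem is that I get an OK result no matter what the error file contains. That is, the log file contains errors, but the output is still OK. Any idea why this happens?
Contents of the error file: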
[root@prod nagios] cat /var/log/puppet/error.log
Dec 25 11:13:12 prod puppet-agent: Could not retrieve catalog; skipping run
Dec 25 11:33:53 prod puppet-agent: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, Could not find data item internalrepo::prod::repo_server in any Hiera data file and no default supplied at /etc/puppet/modules/prod-modules/manifests/params.pp:112:26 on node prod.maker.com
Dec 25 11:33:53 prod puppet-agent: Could not retrieve catalog; skipping run
The file is owned by user nagios, group nagios.
[root@prod nagios] ls -l | grep check_puppet
total 164
-rw-r-----. 1 nagios nagios 469 Dec 25 05:59 check_puppet.cfg
Sample run:
[root@prod nagios] /usr/local/nagios/scripts/check_logs.pl -c /usr/local/nagios/custom/check_logs_puppetclient.cfg
puppet_err.log => OK;
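I don't know which version of the plugin you are using, but according to the official documentation there are no reg_exp or file_name directives. I think you should use criticalpatterns (with logfile for the path), as follows: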
$seek_file_template='/var/log/puppet/$log_file.puppet-agent.check_log.seek';
@log_files =
(
{
'logfile' => '/var/log/puppet/error.log',
'criticalpatterns' => ['Could not send report:', 'Could not retrieve file metadata for ', 'Could not retrieve catalog from remote server', 'Could not retrieve catalog; skipping run', 'Error 500 on SERVER:'],
},
);
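To rule out the patterns themselves as the cause, it may help to test them against the log lines outside the plugin. The following is only a sketch: it assumes a grep that supports -E, writes the sample lines from the question to a scratch file created with mktemp (a hypothetical stand-in for /var/log/puppet/error.log, with the long second line shortened), and reuses the alternation from the original configuration.

```shell
#!/bin/sh
# Scratch file standing in for /var/log/puppet/error.log (hypothetical path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Dec 25 11:13:12 prod puppet-agent: Could not retrieve catalog; skipping run
Dec 25 11:33:53 prod puppet-agent: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error
Dec 25 11:33:53 prod puppet-agent: Could not retrieve catalog; skipping run
EOF

# Same alternation as the reg_exp value in the question's configuration.
pattern='(Could not send report:|Could not retrieve file metadata for |Could not retrieve catalog from remote server|Could not retrieve catalog; skipping run|Error 500 on SERVER:)'

# -c counts matching lines; -E enables extended regular expressions.
matches=$(grep -cE "$pattern" "$sample")
echo "matching lines: $matches"

rm -f "$sample"
```

With the three sample lines this prints "matching lines: 3". If the patterns match here but the plugin still reports OK, the directive names or the configuration file being passed with -c are more likely suspects than the regular expression.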