Using Python to search for a string in a large log file and print (grep) the lines between two patterns

Question  Votes: 0  Answers: 2

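Use Python to search for a string in a large log file and

  1. print (or grep) the lines between the lines that start with dashes
  2. also print just the first line, the one with the date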

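Here is a section of the log file I have. I need help searching for the string "AMQ6209W" and then printing all the lines between the two dashed lines around it. The number of lines between the string and the dashes varies with the error message, which is why we can't simply grep with a fixed line count. To make it look like a real log, you can append this section a few times and change the AMQ error in the copies to AMQ1111W and AMQ1112W, so that AMQ6209W has to be found among them.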

----- amqxfdcx.c : 1145 -------------------------------------------------------
02/05/2024 12:15:45 AM - Process(58674.2) User(mqm) Program(runmqlsr)
                    Host(abc.com) Installation(Installation1)
                    VRMF(7.5.0.15) QMgr(TEST)
                    Time(2024-02-05T05:15:45.505Z)
                    ArithInsert1(1) ArithInsert2(1)
                    CommentInsert1(SIGHUP)
                    CommentInsert2(Signal sent by pid 1)
                    CommentInsert3(systemd)

AMQ6209W: An unexpected asynchronous signal (1 : SIGHUP) has been received and
ignored.


EXPLANATION:
Process 1 received an unexpected asynchronous signal and ignored it. This has
not caused an error but the source of the signal should be determined as it is
likely that the signal has been generated externally to IBM MQ.
ACTION:
Determine the source of the signal and prevent it from re-occurring.
----- amqxerrx.c : 855 --------------------------------------------------------

I tried this, but I'm not getting anywhere:

from itertools import islice
index = 0
with open("/home/AMQERR01.LOG", "r") as f:
    for line in f:
        index += 1
        if "AMQ6209W" in line:
            print("hello")
            f.seek(0)
            print("".join(islice(f, index - 10, index + 8)))
            print("hello1")
python search
2 Answers
0 votes

If we have the lines of the log file in a list, we can use itertools.groupby to pick out only the lines that sit between the --- lines.

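import itertools

# `lines` is assumed to hold the log file's lines with trailing newlines
# stripped, e.g. lines = open("/home/AMQERR01.LOG").read().splitlines()
[
  list(v)
  for k, v in itertools.groupby(lines, key=lambda line: line.startswith('---'))
  if not k
]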

Output:

[
  ['02/05/2024 12:15:45 AM - Process(58674.2) User(mqm) Program(runmqlsr)', 
   '                    Host(abc.com) Installation(Installation1)', 
   '                    VRMF(7.5.0.15) QMgr(TEST)', 
   '                    Time(2024-02-05T05:15:45.505Z)', 
   '                    ArithInsert1(1) ArithInsert2(1)', 
   '                    CommentInsert1(SIGHUP)', 
   '                    CommentInsert2(Signal sent by pid 1)', 
   '                    CommentInsert3(systemd)', 
   '', 
   'AMQ6209W: An unexpected asynchronous signal (1 : SIGHUP) has been received and', 
   'ignored.', 
   '', 
   '', 
   'EXPLANATION:', 
   'Process 1 received an unexpected asynchronous signal and ignored it. This has', 
   'not caused an error but the source of the signal should be determined as it is', 
   'likely that the signal has been generated externally to IBM MQ.', 
   'ACTION:', 
   'Determine the source of the signal and prevent it from re-occurring.'
  ]
]
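To see how the grouping behaves once the log holds more than one record, here is a minimal sketch on a made-up three-separator sample. The sample lines and the names `blocks` and `is_separator` are illustrative assumptions, not part of the original log. `groupby` collects consecutive lines that share the same key, so every run of non-separator lines becomes one list.

```python
import itertools

# Hypothetical miniature log; the message texts are made up for illustration.
lines = [
    "----- a.c : 1 -----",
    "AMQ1111W: first",
    "----- b.c : 2 -----",
    "AMQ6209W: second",
    "ignored.",
    "----- c.c : 3 -----",
]

# groupby yields (key, run) pairs for consecutive lines with the same key;
# keeping only the runs whose key is False drops the separator lines.
blocks = [
    list(group)
    for is_separator, group in itertools.groupby(
        lines, key=lambda line: line.startswith("---")
    )
    if not is_separator
]
print(blocks)
```

Each inner list is one record, so a later filtering step can work on whole records rather than on single lines.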

Modifying it further, we can pull out the lines you are looking for:

[
  [
    line 
    for line in lst 
    if 'AMQ6209W' in line
  ] 
  for k, v in itertools.groupby(lines, key=lambda line: line.startswith('---')) 
  for lst in (list(v),) 
  if not k
]

Result:

[
  ['AMQ6209W: An unexpected asynchronous signal (1 : SIGHUP) has been received and']
]
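The question also asks for the whole block around the match and for the first dated line. A minimal sketch of that variation follows, under two assumptions that are not from the original answer: the helper names `matching_blocks` and `first_dated_line` are hypothetical, and a dated line is taken to start with MM/DD/YYYY as in the sample log.

```python
import itertools
import re

# Assumption: a dated line starts with MM/DD/YYYY, as in the sample log.
DATE_RE = re.compile(r"^\d{2}/\d{2}/\d{4} ")


def matching_blocks(lines, needle):
    """Yield each block between '---' lines that mentions needle."""
    groups = itertools.groupby(lines, key=lambda line: line.startswith("---"))
    for is_separator, group in groups:
        if is_separator:
            continue
        block = list(group)
        if any(needle in line for line in block):
            yield block


def first_dated_line(block):
    """Return the first line of block that starts with a date, or None."""
    return next((line for line in block if DATE_RE.match(line)), None)


# Shortened, partly hypothetical sample: the second record is made up.
sample = [
    "----- amqxfdcx.c : 1145 -----",
    "02/05/2024 12:15:45 AM - Process(58674.2) User(mqm) Program(runmqlsr)",
    "AMQ6209W: An unexpected asynchronous signal (1 : SIGHUP) has been received and",
    "ignored.",
    "----- amqxerrx.c : 855 -----",
    "02/05/2024 12:16:00 AM - Process(1.1) User(mqm) Program(other)",
    "AMQ1111W: hypothetical unrelated message",
    "----- amqxerrx.c : 855 -----",
]

found = list(matching_blocks(sample, "AMQ6209W"))
for block in found:
    print(first_dated_line(block))
```

Because `groupby` accepts any iterable, an open file object can be passed instead of a list, which avoids loading a large log into memory. In that case each line keeps its trailing newline.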

0 votes

Record-oriented grep

Grep is a handy utility. It works on newline-delimited records.

Your records span multiple lines and use a different delimiter. Here is one way to do record-level matching.

from io import TextIOWrapper
from typing import Generator
import re


record_boundary_re = re.compile(r"^----- ")


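def get_records(f: TextIOWrapper) -> Generator[list[str], None, None]:
    """Yield one record (a list of lines) per '----- ' delimited section."""
    record: list[str] = []
    for line in f:
        # A boundary line closes the current record and opens the next one.
        if record_boundary_re.match(line):
            if record:
                yield record
            record = []
        record.append(line.rstrip())

    # Emit the final record, which has no boundary line after it.
    if record:
        yield record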


def grep(needle="AMQ6209W", in_file="/tmp/log.txt") -> None:
    with open(in_file) as fin:
        for record in get_records(fin):
            if needle in "\n".join(record):
                for line in record:
                    print(bold(line) if needle in line else line)


def bold(s: str) -> str:
    """Adds ANSI escape codes to emphasize text."""
    return f"\033[1m{s}\033[0m"
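To try the record-oriented idea without a real log file, the following self-contained sketch feeds an in-memory sample through a similar generator and returns the matching records instead of printing them. The names `iter_records` and `find_records` and the second sample record are hypothetical, chosen for illustration. The type hints assume Python 3.9 or later.

```python
import io
import re
from typing import Iterable, Iterator

RECORD_BOUNDARY_RE = re.compile(r"^----- ")


def iter_records(lines: Iterable[str]) -> Iterator[list[str]]:
    """Group lines into records, each starting at a '----- ' boundary line."""
    record: list[str] = []
    for line in lines:
        # A boundary line ends the record collected so far.
        if RECORD_BOUNDARY_RE.match(line) and record:
            yield record
            record = []
        record.append(line.rstrip())
    if record:
        yield record


def find_records(lines: Iterable[str], needle: str) -> list[list[str]]:
    """Return every record in which at least one line contains needle."""
    return [
        record
        for record in iter_records(lines)
        if any(needle in line for line in record)
    ]


# In-memory stand-in for a log file; the second record is made up.
sample = io.StringIO(
    "----- amqxfdcx.c : 1145 -----\n"
    "02/05/2024 12:15:45 AM - Process(58674.2) User(mqm) Program(runmqlsr)\n"
    "AMQ6209W: An unexpected asynchronous signal (1 : SIGHUP) has been received and\n"
    "ignored.\n"
    "----- amqxerrx.c : 855 -----\n"
    "02/05/2024 12:16:00 AM - Process(1.1) User(mqm) Program(other)\n"
    "AMQ1111W: hypothetical unrelated message\n"
)

hits = find_records(sample, "AMQ6209W")
for record in hits:
    print("\n".join(record))
```

The generator reads one line at a time, so passing an open file handle in place of the `StringIO` object should keep memory use proportional to the largest single record rather than to the whole log.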