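In the output, each number in parentheses needs to be replaced with a period ".". The parentheses at the start and end of the domain also need to be removed.
Can we use re.sub for this? If so, how?
Code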
import re
log = ["4/19/2020 11:59:09 PM 2604 PACKET 0000014DE1921330 UDP Rcv 192.168.1.28 f975 Q [0001 D NOERROR] A (7)pagead2(17)googlesyndication(3)com(0)",
"4/19/2020 11:59:09 PM 0574 PACKET 0000014DE18C4720 UDP R cv 192.168.2.54 9c63 Q [0001 D NOERROR] A (2)pg(3)cdn(5)viber(3)com(0)"]
rx_dict = { 'query': re.compile(r'(?P<query>[\S]*)$') }
for item in log:
    for key, r_exp in rx_dict.items():
        print(f"{r_exp.search(item).group(1)}")
Output
(7)pagead2(17)googlesyndication(3)com(0)
(2)pg(3)cdn(5)viber(3)com(0)
Preferred output
pagead2.googlesyndication.com
pg.cdn.viber.com
Yes, re.sub works for this. In Python:
log = ["4/19/2020 11:59:09 PM 2604 PACKET 0000014DE1921330 UDP Rcv 192.168.1.28 f975 Q [0001 D NOERROR] A (7)pagead2(17)googlesyndication(3)com(0)",
"4/19/2020 11:59:09 PM 0574 PACKET 0000014DE18C4720 UDP R cv 192.168.2.54 9c63 Q [0001 D NOERROR] A (2)pg(3)cdn(5)viber(3)com(0)"]
import re
# strip('.') trims the dots left at both ends by the leading (7)/(2) and the trailing (0)
urls = [re.sub(r'\(\d+\)', '.', t.split()[-1]).strip('.') for t in log]
print(urls)
Output:
['pagead2.googlesyndication.com', 'pg.cdn.viber.com']
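If you would rather keep the compiled-pattern loop from the question, the same substitution can be wrapped in a small helper. This is only a minimal sketch: `decode_name` and `LABEL_LEN` are hypothetical names introduced here, and it assumes the query name is always the last whitespace-separated field of the log line.

```python
import re

# Matches one length prefix such as "(7)" or "(17)".
LABEL_LEN = re.compile(r'\(\d+\)')

def decode_name(raw):
    """Replace each "(N)" with a dot, then trim the dots left at both ends."""
    return LABEL_LEN.sub('.', raw).strip('.')

log = ["4/19/2020 11:59:09 PM 2604 PACKET 0000014DE1921330 UDP Rcv 192.168.1.28 f975 Q [0001 D NOERROR] A (7)pagead2(17)googlesyndication(3)com(0)",
       "4/19/2020 11:59:09 PM 0574 PACKET 0000014DE18C4720 UDP R cv 192.168.2.54 9c63 Q [0001 D NOERROR] A (2)pg(3)cdn(5)viber(3)com(0)"]

rx_dict = {'query': re.compile(r'(?P<query>\S*)$')}

for item in log:
    for key, r_exp in rx_dict.items():
        # Use the named group from the pattern instead of a positional index.
        print(decode_name(r_exp.search(item).group('query')))
```

Note that `strip('.')` removes dots from both ends, whereas `lstrip('.')` would leave the trailing dot produced by the final `(0)`.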