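Regex to match and replace a multiline string in Python

Question (votes: 5, answers: 4)

I need help matching two strings and replacing each with an empty string ''. Thanks for your help, as I am still new to Python and to coding: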

crypto pki certificate chain TP-self-signed-1357590403
  +30820330 30820218 A0030201 02020101 300D0609 2A864886 F70D0101 05050030
  +31312F30 2D060355 04031326 494F532D 53656C66 2D536967 6E65642D 43657274
  +69666963 6174652D 31333537 35393034 3033301E 170D3139 30313234 31353436
  +34345A17 0D323030 31303130 30303030 305A3031 312F302D 06035504 03132649
  +4F532D53 656C662D 5369676E 65642D43 65727469 66696361 74652D31 33353735
  +39303430 33308201 22300D06 092A8648 86F70D01 01010500 0382010F 00308201
  +0A028201 0100E69D C133454E 401E763A 7686E453 5D58020D 0E6E122F A0F19E15
  +E0975148 666110BD C1F09B86 CB701C20 EF85E024 F759A921 D11DA10C A13BA3BD
  +20006387 917287CE EA0CFDDC 2FA5DD07 E5B200F4 108CACA1 DCEF0E4E EEE908ED
  +2ACD693B FC90A24F 9F865CB9 859FEFB0 EB8904D4 8FA83D29 E93B892F 32F3EC7D
  +EAA2850E 1793BBCE 86EA47B2 15645634 D81EA89C 1C2BC092 766DF58F 0B289A82
  +0C92E551 7AA9588E F5B41A41 6DB4C785 101E674D BBBCFB42 9F4F9A25 70389515
  +D1C07E2F 18C0557D 95283E90 3CCD2966 5EBF5668 A6B0B847 0B278906 E5BFA668
  +EFBE938A BE70C4C0 1A8D7218 71463EA5 49540A45 DF307B4C 459E657D C039BB68
  +F047B0B2 2F250203 010001A3 53305130 0F060355 1D130101 FF040530 030101FF
  +301F0603 551D2304 18301680 141FADF3 CC2C2293 810EDAA8 9E55327C D2B7D88A
  +88301D06 03551D0E 04160414 1FADF3CC 2C229381 0EDAA89E 55327CD2 B7D88A88
  +300D0609 2A864886 F70D0101 05050003 82010100 91E63F44 376F91C1 C50C08E4
  +B29B902B B1BC7831 C5607897 030835A6 108FC1F2 6F3DEE23 EF3E8FFF 81A121B5
  +26596004 F8F61DFD 1B603C5D 42D850E6 439C7CAE BFC285AE 3FD83870 125594C0
  +51EAAC09 BF42446F C6399B90 D0E10ACA B208819B 645BECE5 DBDDA9AD EBA1FCD9
  +2B14D0DE AB2AC1BF FF064076 ADBB4540 17AB77A4 C6B0DA3B 1BC0F5B8 44030E7B
  +27318CEE 14C90739 DD8684A8 9346EEC1 3F4958EF 835BA822 F58523C9 E9F83105
  +D3E68700 20DAFC5E B1B8CF5B BAC5CEB3 00321088 43125173 51FC8006 270731E6
  +0E0C6183 68BABA99 BD9F4F28 1EDA82D4 F00F1359 F30B6501 BC468C89 49111AB2
  +CBDE5A9D DB8DB33A 45FE6C96 7D49A70F 4C299618

There are always 27 lines, counting from the first line.

The second string is:

crypto pki certificate chain TP-self-signed-1357590403
 -certificate self-signed 01 nvram:IOS-Self-Sig#1.cer
python regex
4 answers
1
vote

You can use the following code:

import re

inputStr = """crypto pki certificate chain TP-self-signed-1357590403
  +30820330 30820218 A0030201 02020101 300D0609 2A864886 F70D0101 05050030
  +31312F30 2D060355 04031326 494F532D 53656C66 2D536967 6E65642D 43657274
  +69666963 6174652D 31333537 35393034 3033301E 170D3139 30313234 31353436
  +34345A17 0D323030 31303130 30303030 305A3031 312F302D 06035504 03132649
  +4F532D53 656C662D 5369676E 65642D43 65727469 66696361 74652D31 33353735
  +39303430 33308201 22300D06 092A8648 86F70D01 01010500 0382010F 00308201
  +0A028201 0100E69D C133454E 401E763A 7686E453 5D58020D 0E6E122F A0F19E15
  +E0975148 666110BD C1F09B86 CB701C20 EF85E024 F759A921 D11DA10C A13BA3BD
  +20006387 917287CE EA0CFDDC 2FA5DD07 E5B200F4 108CACA1 DCEF0E4E EEE908ED
  +2ACD693B FC90A24F 9F865CB9 859FEFB0 EB8904D4 8FA83D29 E93B892F 32F3EC7D
  +EAA2850E 1793BBCE 86EA47B2 15645634 D81EA89C 1C2BC092 766DF58F 0B289A82
  +0C92E551 7AA9588E F5B41A41 6DB4C785 101E674D BBBCFB42 9F4F9A25 70389515
  +D1C07E2F 18C0557D 95283E90 3CCD2966 5EBF5668 A6B0B847 0B278906 E5BFA668
  +EFBE938A BE70C4C0 1A8D7218 71463EA5 49540A45 DF307B4C 459E657D C039BB68
  +F047B0B2 2F250203 010001A3 53305130 0F060355 1D130101 FF040530 030101FF
  +301F0603 551D2304 18301680 141FADF3 CC2C2293 810EDAA8 9E55327C D2B7D88A
  +88301D06 03551D0E 04160414 1FADF3CC 2C229381 0EDAA89E 55327CD2 B7D88A88
  +300D0609 2A864886 F70D0101 05050003 82010100 91E63F44 376F91C1 C50C08E4
  +B29B902B B1BC7831 C5607897 030835A6 108FC1F2 6F3DEE23 EF3E8FFF 81A121B5
  +26596004 F8F61DFD 1B603C5D 42D850E6 439C7CAE BFC285AE 3FD83870 125594C0
  +51EAAC09 BF42446F C6399B90 D0E10ACA B208819B 645BECE5 DBDDA9AD EBA1FCD9
  +2B14D0DE AB2AC1BF FF064076 ADBB4540 17AB77A4 C6B0DA3B 1BC0F5B8 44030E7B
  +27318CEE 14C90739 DD8684A8 9346EEC1 3F4958EF 835BA822 F58523C9 E9F83105
  +D3E68700 20DAFC5E B1B8CF5B BAC5CEB3 00321088 43125173 51FC8006 270731E6
  +0E0C6183 68BABA99 BD9F4F28 1EDA82D4 F00F1359 F30B6501 BC468C89 49111AB2
  +CBDE5A9D DB8DB33A 45FE6C96 7D49A70F 4C299618
crypto pki certificate chain TP-self-signed-1357590403"""

print(re.sub(r'crypto pki certificate chain TP-self-signed-\d+\s*[0-9a-fA-F+\s]+\s*crypto pki certificate chain TP-self-signed-\d+', '' , inputStr))

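Output: empty (the whole input string is matched and removed)

Regex demo: https://regex101.com/r/G9XciA/2/

Explanation of the regex:

  • crypto pki certificate chain TP-self-signed-\d+\s* matches the first line, whose ending is assumed to be digits only, followed by any whitespace characters
  • [0-9a-fA-F+\s]+ matches hexadecimal characters, + and whitespace characters
  • crypto pki certificate chain TP-self-signed-\d+ matches the closing last line

If the first and last lines must carry the same ID, use this regex:

crypto pki certificate chain TP-self-signed-(\d+)\s*[0-9a-fA-F+\s]+\s*crypto pki certificate chain TP-self-signed-\1

where \1 is a backreference to the first capturing group.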
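To make the backreference idea concrete, here is a minimal sketch in Python. The sample text is an assumption: a short stand-in with two hex lines instead of 26, plus made-up `hostname R1` and `end` lines around the block.

```python
import re

# Hypothetical sample: a shortened stand-in for the block in the question.
sample = (
    "hostname R1\n"
    "crypto pki certificate chain TP-self-signed-1357590403\n"
    "  +30820330 30820218 A0030201\n"
    "  +CBDE5A9D DB8DB33A 45FE6C96\n"
    "crypto pki certificate chain TP-self-signed-1357590403\n"
    "end\n"
)

# (\d+) captures the ID on the opening line; \1 requires the same
# digits to appear again on the closing line.
pattern = re.compile(
    r"crypto pki certificate chain TP-self-signed-(\d+)\s*"
    r"[0-9a-fA-F+\s]+\s*"
    r"crypto pki certificate chain TP-self-signed-\1"
)

cleaned = pattern.sub("", sample)
print(repr(cleaned))

# A closing line with an unrelated ID leaves the text untouched.
mismatched = sample.replace(
    "1357590403\nend", "2468013579\nend"
)
print(pattern.sub("", mismatched) == mismatched)
```

One caveat worth checking against your own data: because \d+ can backtrack, a closing ID that merely starts with the same leading digits as the opening ID may still produce a match. Adding \b after \1 is one possible way to tighten that.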



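2
votes

If you want to match the block up to and including the next crypto line, you can match each following line and use a negative lookahead to assert that the line does not start with crypto.

Then match a newline and crypto until the end of the line: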

https://regex101.com/r/G9XciA/3

^crypto pki certificate chain TP-self-signed-.*(?:\n(?!crypto).*)*\ncrypto.*
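As a rough illustration of how that pattern behaves in Python, here is a small sketch. The sample text is an assumption (two hex lines instead of 26, with made-up `hostname R1` and `end` lines), and `re.MULTILINE` is needed so that ^ can match at the start of a line inside the string.

```python
import re

# Hypothetical sample: a shortened stand-in for the block in the question.
sample = (
    "hostname R1\n"
    "crypto pki certificate chain TP-self-signed-1357590403\n"
    "  +30820330 30820218\n"
    "  +CBDE5A9D DB8DB33A\n"
    "crypto pki certificate chain TP-self-signed-1357590403\n"
    "end\n"
)

pattern = re.compile(
    r"^crypto pki certificate chain TP-self-signed-.*"  # opening line
    r"(?:\n(?!crypto).*)*"  # following lines that do not start with crypto
    r"\ncrypto.*",          # the next line that does start with crypto
    flags=re.MULTILINE,
)

match = pattern.search(sample)
matched_lines = match.group(0).splitlines()
print(len(matched_lines))

cleaned = pattern.sub("", sample)
print(repr(cleaned))
```

The lookahead is checked once per line, right after each newline, so the repeated group stops as soon as the next line begins with crypto.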

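If the starting line should be the same as the line at the end, you can use a capturing group for the first line together with a backreference to it: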

^(crypto pki certificate chain TP-self-signed-.*)(?:\n(?!\1).*)*\n\1

Your code could look like this:

pattern = r'^(crypto pki certificate chain TP-self-signed-.*)(?:\n(?!\1).*)*\n\1'
df = re.sub(pattern, '', file, flags=re.MULTILINE)

1
vote

Why not use this regex

(crypto pki certificate chain TP-self-signed-\d+)[\w\W]+?\1

and replace the match with an empty string?

Am I missing something? The other answers seem to suggest rather complicated solutions involving newlines.
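A quick way to try the lazy pattern (crypto pki certificate chain TP-self-signed-\d+)[\w\W]+?\1 is sketched below. The input is an assumption: two short blocks with different IDs and a made-up `keep me` line between them, to show that the lazy quantifier stops at the first repeat of each opening line.

```python
import re

HEADER = "crypto pki certificate chain TP-self-signed-"

# Hypothetical input: two shortened blocks, each closed by a repeat of
# its own opening line, with one unrelated line between them.
sample = (
    f"{HEADER}1357590403\n"
    "  +30820330 30820218\n"
    f"{HEADER}1357590403\n"
    "keep me\n"
    f"{HEADER}2468013579\n"
    "  +CBDE5A9D DB8DB33A\n"
    f"{HEADER}2468013579\n"
)

# [\w\W] matches any character, newlines included; +? makes it lazy,
# so each match ends at the first repeat of the captured opening line.
pattern = re.compile(
    r"(crypto pki certificate chain TP-self-signed-\d+)[\w\W]+?\1"
)

cleaned, count = pattern.subn("", sample)
print(count)
print(repr(cleaned))
```

With a greedy [\w\W]+ instead, a file that repeats the same opening line more than twice would be matched through to the last repeat, which is why the lazy form is used here.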

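Edit: Based on your comment, "Actually I need to remove: crypto pki certificate chain TP-self-signed-1357590403 plus the next 26 lines starting with +":

You can use this regex, which selects the crypto pki certificate chain TP-self-signed-1357590403 line together with the 26 lines starting with + that follow it.

crypto pki certificate chain TP-self-signed-\d+(?:\n\s*\+[^\n]*){26}

As you can see in the demo, after the crypto line it selects only the 26 lines starting with +, and the match is replaced with an empty string. Let me know if you run into any problems.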
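Here is a minimal sketch of the {26} version in Python. The input is an assumption: the 26 lines are generated placeholders rather than the real hex dump, and the `hostname R1` and `end` lines are made up to show what is left behind.

```python
import re

HEADER = "crypto pki certificate chain TP-self-signed-1357590403"

# Hypothetical input: the header, 26 generated "+" lines standing in for
# the hex dump, and one unrelated line on each side.
plus_lines = [f"  +{i:08X} {i + 1:08X}" for i in range(26)]
config = "\n".join(["hostname R1", HEADER, *plus_lines, "end"]) + "\n"

# Each repetition consumes a newline, optional whitespace, a literal "+",
# and the rest of that line; {26} demands exactly 26 such lines.
pattern = re.compile(
    r"crypto pki certificate chain TP-self-signed-\d+(?:\n\s*\+[^\n]*){26}"
)

cleaned, count = pattern.subn("", config)
print(count)
print(repr(cleaned))

# With only 25 "+" lines after the header there is no match at all.
too_short = "\n".join([HEADER, *plus_lines[:25]])
print(pattern.search(too_short))
```

Because the count is fixed, a block with fewer than 26 + lines is left alone entirely, so this only suits input where the line count really is constant, as the question states.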


1
vote

Since you have not said what result you want, it is impossible to know exactly what you are after, so we can only guess.

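If you simply want to replace it, you can use something like

import re

# Read the whole file into a single string
with open('text.txt', encoding="utf8") as f:
    document_x = f.read()

# Note: this pattern matches every line, so the whole text is replaced
regex_test = re.sub(r".*\n*( +.*)*", "", document_x)

print(regex_test)

To remove everything between the crypto lines, use

regex_test = re.sub(r"(?:\n(?!crypto).*)*", "", document_x)

or, to remove the crypto lines themselves, you can instead use

regex_test = re.sub(r"crypto pki certificate chain TP-self-signed-[0-9]+\n", "",
                     document_x, flags=re.MULTILINE)

I have confirmed in a Python 3.6.1 shell that these work. Online regex testers, while useful, do not always return the same results as Python itself.

A possible example answer is

import re

# Read the whole file into a single string
with open('text.csv', encoding="utf8") as f:
    document_x = f.read()

# Greedy match from the first "crypto" to the last "1357590403"
regex_test = re.sub(r"(crypto[\s\S]*1357590403)", "", document_x)

print(regex_test)

You should modify it to suit your needs; this is just an example. Given that you want to remove the whole block but nothing before or after it, running the above example on input such as the following removes the block and leaves what surrounds it (Placeholder 1 and Placeholder 2):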

Placeholder 1
crypto pki certificate chain TP-self-signed-1357590403
  +30820330 30820218 A0030201 02020101 300D0609 2A864886 F70D0101 05050030
  +31312F30 2D060355 04031326 494F532D 53656C66 2D536967 6E65642D 43657274
  +69666963 6174652D 31333537 35393034 3033301E 170D3139 30313234 31353436
  +34345A17 0D323030 31303130 30303030 305A3031 312F302D 06035504 03132649
  +4F532D53 656C662D 5369676E 65642D43 65727469 66696361 74652D31 33353735
  +39303430 33308201 22300D06 092A8648 86F70D01 01010500 0382010F 00308201
  +0A028201 0100E69D C133454E 401E763A 7686E453 5D58020D 0E6E122F A0F19E15
  +E0975148 666110BD C1F09B86 CB701C20 EF85E024 F759A921 D11DA10C A13BA3BD
  +20006387 917287CE EA0CFDDC 2FA5DD07 E5B200F4 108CACA1 DCEF0E4E EEE908ED
  +2ACD693B FC90A24F 9F865CB9 859FEFB0 EB8904D4 8FA83D29 E93B892F 32F3EC7D
  +EAA2850E 1793BBCE 86EA47B2 15645634 D81EA89C 1C2BC092 766DF58F 0B289A82
  +0C92E551 7AA9588E F5B41A41 6DB4C785 101E674D BBBCFB42 9F4F9A25 70389515
  +D1C07E2F 18C0557D 95283E90 3CCD2966 5EBF5668 A6B0B847 0B278906 E5BFA668
  +EFBE938A BE70C4C0 1A8D7218 71463EA5 49540A45 DF307B4C 459E657D C039BB68
  +F047B0B2 2F250203 010001A3 53305130 0F060355 1D130101 FF040530 030101FF
  +301F0603 551D2304 18301680 141FADF3 CC2C2293 810EDAA8 9E55327C D2B7D88A
  +88301D06 03551D0E 04160414 1FADF3CC 2C229381 0EDAA89E 55327CD2 B7D88A88
  +300D0609 2A864886 F70D0101 05050003 82010100 91E63F44 376F91C1 C50C08E4
  +B29B902B B1BC7831 C5607897 030835A6 108FC1F2 6F3DEE23 EF3E8FFF 81A121B5
  +26596004 F8F61DFD 1B603C5D 42D850E6 439C7CAE BFC285AE 3FD83870 125594C0
  +51EAAC09 BF42446F C6399B90 D0E10ACA B208819B 645BECE5 DBDDA9AD EBA1FCD9
  +2B14D0DE AB2AC1BF FF064076 ADBB4540 17AB77A4 C6B0DA3B 1BC0F5B8 44030E7B
  +27318CEE 14C90739 DD8684A8 9346EEC1 3F4958EF 835BA822 F58523C9 E9F83105
  +D3E68700 20DAFC5E B1B8CF5B BAC5CEB3 00321088 43125173 51FC8006 270731E6
  +0E0C6183 68BABA99 BD9F4F28 1EDA82D4 F00F1359 F30B6501 BC468C89 49111AB2
  +CBDE5A9D DB8DB33A 45FE6C96 7D49A70F 4C299618
crypto pki certificate chain TP-self-signed-1357590403
Placeholder 2
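For anyone who wants to try this without creating a file, here is a small self-contained sketch. It assumes an abbreviated version of the sample above (two hex lines instead of 26) held in a string rather than read from disk.

```python
import re

# Abbreviated stand-in for the sample above: only two hex lines are kept.
document_x = (
    "Placeholder 1\n"
    "crypto pki certificate chain TP-self-signed-1357590403\n"
    "  +30820330 30820218 A0030201 02020101\n"
    "  +CBDE5A9D DB8DB33A 45FE6C96 7D49A70F 4C299618\n"
    "crypto pki certificate chain TP-self-signed-1357590403\n"
    "Placeholder 2\n"
)

# [\s\S]* is greedy, so the match runs from the first "crypto" to the
# LAST occurrence of "1357590403" in the text.
regex_test = re.sub(r"crypto[\s\S]*1357590403", "", document_x)

print(regex_test)
```

Since the match is greedy and the ID is hard-coded, a file containing several such blocks would lose everything between the first crypto and the final 1357590403, so this is probably only suitable when there is a single block.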