How do I match using a positive regex lookahead while excluding the lookahead part?

Question (votes: 0, answers: 2)

The lines to match are

part1a_part1b__part1c_part1d_part3.extension
part1a_part1b__part1c_part1d__part3.extension
part1a_part1b__part1c_part1d_part2short_part3.extension
part1a_part1b__part1c_part1d_part2short__part3.extension
part1a_part1b__part1c_part1d_part2_part3.extension
part1a_part1b__part1c_part1d_part2__part3.extension
part1a_part1b__part1c_part1d_part2full_part3.extension
part1a_part1b__part1c_part1d_part2full__part3.extension
part1a_part1b__part1c_part1d_part2short-part3.extension
part1a_part1b__part1c_part1d_part2-part3.extension
part1a_part1b__part1c_part1d_part2full-part3.extension
part1a_part1b__part1c_part1d_part4.extension
part1a_part1b__part1c_part1d__part4.extension

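Except for the last two lines, the desired match should yield exactly part1a_part1b__part1c_part1d for every line above. That is, a "stem" made of an arbitrary number of part1 pieces, then an optional part2 (in limited forms), and the line must end with part3.extension.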

So far, I have only gotten as far as

(?P<stem>[[:alnum:]_-]+)(?=(|part2short|part2|part2full))[_-]+part3\.extension

for which the "stem" values matched on the lines above are

part1a_part1b__part1c_part1d
part1a_part1b__part1c_part1d_
part1a_part1b__part1c_part1d_part2short
part1a_part1b__part1c_part1d_part2short_
part1a_part1b__part1c_part1d_part2
part1a_part1b__part1c_part1d_part2_
part1a_part1b__part1c_part1d_part2full
part1a_part1b__part1c_part1d_part2full_
part1a_part1b__part1c_part1d_part2short
part1a_part1b__part1c_part1d_part2
part1a_part1b__part1c_part1d_part2full    

Could you comment on how to match part1a_part1b__part1c_part1d in all of the lines above except the last two, if that is possible?
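For reference, the behaviour described above can be reproduced with a short Python sketch. Python's re module has no POSIX classes, so [[:alnum:]] is written as A-Za-z0-9 here; that substitution and the names LINES, ATTEMPT and attempt_stems are assumptions of this sketch, not part of the question. The likely cause of the unwanted stems is that the lookahead contains an empty alternative, so it succeeds at every position and never constrains where the greedy stem stops.

```python
import re

LINES = [
    "part1a_part1b__part1c_part1d_part3.extension",
    "part1a_part1b__part1c_part1d__part3.extension",
    "part1a_part1b__part1c_part1d_part2short_part3.extension",
    "part1a_part1b__part1c_part1d_part2short__part3.extension",
    "part1a_part1b__part1c_part1d_part2_part3.extension",
    "part1a_part1b__part1c_part1d_part2__part3.extension",
    "part1a_part1b__part1c_part1d_part2full_part3.extension",
    "part1a_part1b__part1c_part1d_part2full__part3.extension",
    "part1a_part1b__part1c_part1d_part2short-part3.extension",
    "part1a_part1b__part1c_part1d_part2-part3.extension",
    "part1a_part1b__part1c_part1d_part2full-part3.extension",
    "part1a_part1b__part1c_part1d_part4.extension",
    "part1a_part1b__part1c_part1d__part4.extension",
]

# The question's pattern, with [[:alnum:]] spelled out as A-Za-z0-9
# (assumption: Python's re does not support POSIX bracket classes).
ATTEMPT = re.compile(
    r"(?P<stem>[A-Za-z0-9_-]+)(?=(|part2short|part2|part2full))[_-]+part3\.extension"
)


def attempt_stems(lines):
    """Return the captured stem for every line the pattern matches."""
    stems = []
    for line in lines:
        m = ATTEMPT.search(line)
        if m:
            stems.append(m.group("stem"))
    return stems


if __name__ == "__main__":
    # The greedy stem backs off only far enough to leave one separator
    # before part3.extension, so part2 and extra underscores stay in it.
    for stem in attempt_stems(LINES):
        print(stem)
```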

regex regex-lookarounds
2 Answers
1 vote

You can use this regex, which combines a non-greedy match with a lookahead containing an optional group:

(?m)^(?P<stem>[[:alnum:]_-]+?)(?=(?:[_-]+part2(?:short|full)?)?[_-]+part3\.extension$)


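(?=(?:[_-]+part2(?:short|full)?)?[_-]+part3\.extension$) is a positive lookahead. It asserts that the rest of the line is one or more [_-] separators followed by part3.extension, optionally preceded by separators and part2, part2short, or part2full. A lookahead consumes no characters, so the stem group holds only the part1 portion, and the non-greedy +? makes the stem stop at the first position where the lookahead succeeds.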
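A minimal Python sketch of this pattern run against the question's sample lines follows. It assumes [[:alnum:]] can be replaced by A-Za-z0-9, since Python's re module has no POSIX classes, and passes re.MULTILINE in place of the inline (?m); the names TEXT, STEM_RE and find_stems are illustrative only.

```python
import re

TEXT = "\n".join([
    "part1a_part1b__part1c_part1d_part3.extension",
    "part1a_part1b__part1c_part1d__part3.extension",
    "part1a_part1b__part1c_part1d_part2short_part3.extension",
    "part1a_part1b__part1c_part1d_part2short__part3.extension",
    "part1a_part1b__part1c_part1d_part2_part3.extension",
    "part1a_part1b__part1c_part1d_part2__part3.extension",
    "part1a_part1b__part1c_part1d_part2full_part3.extension",
    "part1a_part1b__part1c_part1d_part2full__part3.extension",
    "part1a_part1b__part1c_part1d_part2short-part3.extension",
    "part1a_part1b__part1c_part1d_part2-part3.extension",
    "part1a_part1b__part1c_part1d_part2full-part3.extension",
    "part1a_part1b__part1c_part1d_part4.extension",
    "part1a_part1b__part1c_part1d__part4.extension",
])

# Lazy stem, then a lookahead that describes the whole tail of the line.
# [A-Za-z0-9_-] stands in for [[:alnum:]_-] (assumption for Python's re).
STEM_RE = re.compile(
    r"^(?P<stem>[A-Za-z0-9_-]+?)"
    r"(?=(?:[_-]+part2(?:short|full)?)?[_-]+part3\.extension$)",
    re.MULTILINE,
)


def find_stems(text):
    """Return the stem of every line that ends with part3.extension."""
    return [m.group("stem") for m in STEM_RE.finditer(text)]


if __name__ == "__main__":
    # The two part4 lines produce no match at all.
    for stem in find_stems(TEXT):
        print(stem)
```

Because the tail is checked inside the lookahead, the overall match (group 0) is identical to the stem group, so nothing from part2 or part3.extension ends up in the result.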


1 vote

You can match the first 4 parts as runs of text joined by underscores, and use a positive lookahead to assert that the string ends with part3.extension:

^(?P<stem>[^_]+_[^_]+__[^_]+_[^_]+)(?=.*part3\.extension$)

This will match:

^                     # Start of the string
(?P<stem>             # Named captured group stem
[^_]+_                # Match not _ one or more times, then _
[^_]+__               # Match not _ one or more times, then __
[^_]+_                # Match not _ one or more times, then _
[^_]+                 # Match not _ one or more times
)                     # Close named capturing group
(?=                   # A positive lookahead that asserts what follows
  .*part3\.extension$ # Match part3.extension at the end of the string
)                     # Close lookahead
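The commented pattern above maps fairly directly onto Python's re.VERBOSE mode, which ignores whitespace and # comments inside the pattern. The sketch below is one way to try it against the question's lines; the names LINES, SHAPE_RE and stem_of are illustrative assumptions. Note that this approach relies on the stem always having the exact shape a_b__c_d, and the lookahead only checks for a trailing part3.extension without restricting what part2 may look like.

```python
import re

LINES = [
    "part1a_part1b__part1c_part1d_part3.extension",
    "part1a_part1b__part1c_part1d__part3.extension",
    "part1a_part1b__part1c_part1d_part2short_part3.extension",
    "part1a_part1b__part1c_part1d_part2short__part3.extension",
    "part1a_part1b__part1c_part1d_part2_part3.extension",
    "part1a_part1b__part1c_part1d_part2__part3.extension",
    "part1a_part1b__part1c_part1d_part2full_part3.extension",
    "part1a_part1b__part1c_part1d_part2full__part3.extension",
    "part1a_part1b__part1c_part1d_part2short-part3.extension",
    "part1a_part1b__part1c_part1d_part2-part3.extension",
    "part1a_part1b__part1c_part1d_part2full-part3.extension",
    "part1a_part1b__part1c_part1d_part4.extension",
    "part1a_part1b__part1c_part1d__part4.extension",
]

SHAPE_RE = re.compile(
    r"""
    ^                        # start of the string
    (?P<stem>                # named capturing group: stem
      [^_]+_                 # first part, then a single underscore
      [^_]+__                # second part, then a double underscore
      [^_]+_                 # third part, then a single underscore
      [^_]+                  # fourth part, stops at the next underscore
    )
    (?=                      # positive lookahead, consumes nothing
      .*part3\.extension$    # the string must end with part3.extension
    )
    """,
    re.VERBOSE,
)


def stem_of(line):
    """Return the four-part stem of a line, or None if it does not qualify."""
    m = SHAPE_RE.match(line)
    return m.group("stem") if m else None


if __name__ == "__main__":
    for line in LINES:
        print(line, "->", stem_of(line))
```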