How can I ensure a capture group is longer than 5 characters?

Question (votes: 1, answers: 1)

I am using this regex:

(?i)(?<!see )(?<!\d)(?<!")(?<!“)ITEM.*?1A.*?\n*(?<!")(?<!“)RISK.*?FACTORS(?<!")\n*([\s\S]*?)\n*ITEM.*?1B

It captures the text between ITEM 1A. RISK FACTORS and ITEM 1B., but how can I make it match only when the capture group is longer than 5 characters?

Full string:

ITEM 1A.    RISK FACTORS

123

ITEM 1B.

ITEM 1A.    RISK FACTORS

In addition to other information in this Form 10-K, the following risk factors should be carefully considered in evaluating us and our business because these factors currently have a significant impact or 

ITEM 1B.

So the desired capture group would be:

In addition to other information in this Form 10-K, the following risk factors should be carefully considered in evaluating us and our business because these factors currently have a significant impact or 

and not:

123
Tags: regex

1 Answer

0 votes

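Do the counting close to the data, like this. The quantifier {3,}? together with the leading and trailing \S requires group 1 to be at least 5 characters long; use {4,}? if it must be strictly longer than 5. If your engine supports \h, the regex can be shortened considerably by replacing each [^\S\r\n] with \h. Group 1 contains the trimmed data.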

(?sm)^[^\S\r\n]*ITEM[^\S\r\n]+1A[^\S\r\n]*\.[^\S\r\n]+RISK[^\S\r\n]+FACTORS[^\S\r\n]*\r?\n\s*(\S(?:(?!^[^\S\r\n]*ITEM).){3,}?\S)\s*^[^\S\r\n]*ITEM[^\S\r\n]+1B[^\S\r\n]*\.

https://regex101.com/r/ChQseo/1
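
For anyone who wants to try this outside regex101, here is a minimal Python sketch of the regex above applied to the sample from the question. The names SECTION_RE, SAMPLE and extract_risk_sections are made up for illustration, and the choice of Python is an assumption (the question does not say which engine is used). The [^\S\r\n] form is kept because Python's re module does not support \h.

```python
import re

# The answer's regex, unchanged, split across lines only for readability.
SECTION_RE = re.compile(
    r"(?sm)^[^\S\r\n]*ITEM[^\S\r\n]+1A[^\S\r\n]*\.[^\S\r\n]+RISK[^\S\r\n]+FACTORS"
    r"[^\S\r\n]*\r?\n\s*(\S(?:(?!^[^\S\r\n]*ITEM).){3,}?\S)\s*"
    r"^[^\S\r\n]*ITEM[^\S\r\n]+1B[^\S\r\n]*\."
)

# Sample text from the question: a short section ("123") and a long one.
SAMPLE = (
    "ITEM 1A.    RISK FACTORS\n"
    "\n"
    "123\n"
    "\n"
    "ITEM 1B.\n"
    "\n"
    "ITEM 1A.    RISK FACTORS\n"
    "\n"
    "In addition to other information in this Form 10-K, the following risk "
    "factors should be carefully considered in evaluating us and our business "
    "because these factors currently have a significant impact or \n"
    "\n"
    "ITEM 1B.\n"
)


def extract_risk_sections(text):
    """Return the trimmed body (group 1) of every section that is long enough."""
    return SECTION_RE.findall(text)


if __name__ == "__main__":
    for body in extract_risk_sections(SAMPLE):
        print(repr(body))
```

The "123" section is skipped because group 1 cannot reach 5 characters without crossing a line that starts with ITEM, which the negative lookahead forbids.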

Expanded:

 (?sm)
 ^ [^\S\r\n]* ITEM [^\S\r\n]+ 1A [^\S\r\n]* \. 
 [^\S\r\n]+ RISK [^\S\r\n]+ FACTORS [^\S\r\n]* \r? \n 

 \s* 
 (                             # (1 start)
      \S 
      (?:
           (?! ^ [^\S\r\n]* ITEM )
           . 
      ){3,}?
      \S 
 )                             # (1 end)
 \s* 

 ^ [^\S\r\n]* ITEM [^\S\r\n]+ 1B [^\S\r\n]* \.
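
The expanded layout maps fairly directly onto Python's re.VERBOSE mode, which also makes it easy to turn the minimum length into a parameter. The following is only a sketch under that assumption: build_pattern and min_len are hypothetical names, and the flags are passed explicitly instead of using the inline (?sm).

```python
import re


def build_pattern(min_len=5):
    """Compile the expanded regex; group 1 must span at least min_len characters."""
    if min_len < 2:
        raise ValueError("min_len must be at least 2: one leading and one trailing non-space")
    middle = min_len - 2  # characters allowed between the two \S anchors
    return re.compile(
        rf"""
        ^ [^\S\r\n]* ITEM [^\S\r\n]+ 1A [^\S\r\n]* \.
        [^\S\r\n]+ RISK [^\S\r\n]+ FACTORS [^\S\r\n]* \r? \n
        \s*
        (                               # group 1: trimmed section body
            \S
            (?: (?! ^ [^\S\r\n]* ITEM ) . ){{{middle},}}?
            \S
        )
        \s*
        ^ [^\S\r\n]* ITEM [^\S\r\n]+ 1B [^\S\r\n]* \.
        """,
        re.VERBOSE | re.DOTALL | re.MULTILINE,
    )


if __name__ == "__main__":
    text = "ITEM 1A. RISK FACTORS\n\n12345\n\nITEM 1B.\n"
    print(build_pattern(5).search(text))  # matches: body is exactly 5 characters
    print(build_pattern(6).search(text))  # no match: body is not longer than 5
```

Calling build_pattern(6) corresponds to the "strictly more than 5 characters" reading of the question, while the default of 5 reproduces the {3,}? quantifier used in the answer.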

-1 votes

I think maybe
