JavaScript regex lookbehind: Invalid regexp group

Question    Votes: 0    Answers: 1

I have the following small example, which uses the regex /-+|(?<=: ?).*/gm. It leads to an infinite loop in Node and Chrome, and to an "Invalid regex group" error in Firefox.

If I change the regex to /-+|(?<=: ).*/gm, the loop terminates, but the lines that end with the colon are not matched. If I change it to /-+|(?<=:).*/gm, I run into the infinite loop (or the error) again.

Can anyone explain this behaviour to me, and tell me which regex I have to use to also match the lines that end with a colon? I would love to understand it.

const text = `
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
`;

const pattern = /-+|(?<=: ?).*/gm;

let res;
// Never terminates in Node/Chrome: once a match is empty, exec() keeps returning it
while((res = pattern.exec(text)) !== null)
{
    console.log(`"${res[0]}"`);
}

EDIT:

The expected output is:

"-------------------------------------"
"5048603"
""
"asjhgg | a3857"
"Something..."
"-------------------------------------"
"5048603"
""
"asjhgg | a3857"
"Something..."
"-------------------------------------"

javascript regex regex-lookarounds
1 answer
3
votes

The (?<=...) lookaround is a positive lookbehind, and it is not supported in Firefox versions before 78 (see the list of supported environments), so there you will always get an exception.

The /-+|(?<=: ?).*/ pattern belongs to the patterns that may match an empty string, and this is a very typical "pathological" type of pattern. The g flag makes the JS regex engine match all occurrences of the pattern, and to do that, exec sets lastIndex to the end of each match. For a zero-length match, that is the position where the match started, so lastIndex does not advance, the engine keeps trying the same regex at the same location all over again, and you end up in an infinite loop. You have to move lastIndex forward yourself to avoid infinite loops in these cases.

From what I see, you want to remove everything from the beginning of each line up to and including the first : and any whitespace after it. You may use

text.replace(/^[^:\r\n]+:[^\S\r\n]*/gm, '')

Or, if you want to actually extract the lines that consist only of -s, or everything after the :, you may use

const text = `
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
`;

// Either a whole line of dashes, or a colon, optional horizontal
// whitespace, and the rest of the line captured in group 1
const pattern = /^-+$|:[^\S\r\n]*(.*)/gm;

let res;
while((res = pattern.exec(text)) !== null)
{
    // Group 1 is set (possibly to "") only when the colon branch matched
    if (res[1] !== undefined) {
      console.log(res[1]);
    } else {
      console.log(res[0]);
    }
}

0
votes

A regex lookahead is defined like this: (?=pattern), and not (pattern?).

Try to use this pattern: /(.*):(.*)/mg

const regex = /(.*):(.*)/mg;
const str = `-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------
Prop Name: 5048603
Prop2 Name:
Bla bla bla: asjhgg | a3857
Location: Something...
-------------------------------------`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }

    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}

0
votes

Up front: Wiktor's answer is the one to use to make this work cross-browser.

For anyone who is interested in how to get this to work in Chrome with the "original" pattern (thanks to Wiktor's answer for pointing out that lastIndex is not advanced on zero-length matches):

// `text` is the sample string from the question
const pattern = /-+|(?<=: ?).*/gm;

let res;
while((res = pattern.exec(text)) !== null)
{
    // Step past a zero-length match so that exec() does not return it again
    if(res.index === pattern.lastIndex)
        pattern.lastIndex++;
    console.log(`"${res[0]}"`);
}
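As a complement to the manual lastIndex handling in the answers above, the loop can also be written with String.prototype.matchAll, which steps past zero-length matches by itself. The following is only a minimal sketch, not code from any of the answers: the helper name extractValues and the shortened sample string are assumptions made for illustration, and the pattern is the lookbehind-free one from the top-voted answer.

```javascript
// Hypothetical helper (not part of the original answers): returns the
// separator lines and the values after the first colon of each line.
function extractValues(text) {
  // Lookbehind-free pattern from the top-voted answer: a whole line of
  // dashes, or a colon, optional horizontal whitespace, and the rest of
  // the line captured in group 1.
  const pattern = /^-+$|:[^\S\r\n]*(.*)/gm;
  const results = [];
  // matchAll requires the g flag and iterates over every match;
  // no manual lastIndex bookkeeping is needed.
  for (const match of text.matchAll(pattern)) {
    // Group 1 is undefined only when the dash branch matched.
    results.push(match[1] !== undefined ? match[1] : match[0]);
  }
  return results;
}

// Shortened stand-in for the sample text in the question (assumption).
const sample = [
  '-----',
  'Prop Name: 5048603',
  'Prop2 Name:',
  'Bla bla bla: asjhgg | a3857',
  '-----',
].join('\n');

console.log(extractValues(sample));
// → [ '-----', '5048603', '', 'asjhgg | a3857', '-----' ]

// matchAll also terminates on a pattern that can match the empty string:
const emptyCapable = [...'a12b'.matchAll(/\d*/g)].map((m) => m[0]);
console.log(emptyCapable);
// → [ '', '12', '', '' ]
```

The empty string in the first result is the line that ends with the colon, which is what the question asks for. matchAll needs a reasonably recent engine (Node 12 or later), so the explicit lastIndex check remains the fallback for older environments.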
