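I am currently trying to remove every form of comment from an input file. However, I cannot figure out how to remove one particular form, "{comment}". I know this site has plenty of regex examples for removing multi-line and single-line comments, but I cannot work this one out.
Input: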
int j=100;
/* comment needs to be removed*/
int c = 200;
/*
*comment needs to be removed
*/
count = count + 1;
{comment needs to be removed}
i++;
Output:
int j=100;
int c =200;
count = count +1;
i++;
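I have been able to remove the first two comments, but not the last one. I tried the regex "{}".* but it does not work for my last comment, {comment}. Is there a regex that can fix this, or am I better off writing a function in C and handling this case that way?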
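==Note that for all of the regexes below, each match must be replaced with $2 (capture group 2), which writes the non-comment text back. This effectively removes all comments.==
This is a standard C++ comment parser. It is the extended version that preserves formatting.
Raw: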
(?m)((?:(?:^[ \t]*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//)))?|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|/\*|//))|(?=\r?\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|(?:\r?\n|[\S\s])[^/"'\\\s]*)
Delimited /regex/:
/(?m)((?:(?:^[ \t]*)?(?:\/\*[^*]*\*+(?:[^\/*][^*]*\*+)*\/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/)))?|\/\/(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/))|(?=\r?\n))))+)|((?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?:\r?\n(?:(?=(?:^[ \t]*)?(?:\/\*|\/\/))|[^\/"'\\\r\n]*))+|[^\/"'\\\r\n]+)+|[\S\s][^\/"'\\\r\n]*)/
PCRE demo: https://regex101.com/r/UldYK5/1 Python demo: https://regex101.com/r/avfSfB/1
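As a rough illustration of the "replace with $2" rule, here is a minimal Python sketch that applies the raw pattern above with `re.sub`. The names `CPP_COMMENTS` and `strip_comments` and the sample text are made up for this example; the replacement callback returns group 2 when it matched and an empty string otherwise, which is the Python counterpart of substituting `$2`.

```python
import re

# Raw pattern copied from the answer (group 1 = comments, group 2 = non-comments).
CPP_COMMENTS = re.compile(
    r'''(?m)((?:(?:^[ \t]*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//)))?|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|/\*|//))|(?=\r?\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|(?:\r?\n|[\S\s])[^/"'\\\s]*)'''
)


def strip_comments(text: str) -> str:
    """Hypothetical helper: keep group 2 (non-comments), drop group 1 (comments)."""
    return CPP_COMMENTS.sub(lambda m: m.group(2) or "", text)


# Illustrative input; it ends with a newline so the trailing // comment is terminated.
sample = (
    'int j=100;\n'
    '/* comment needs to be removed*/\n'
    'const char *s = "/* not a comment */"; // trailing note\n'
    'int c = 200;\n'
)

cleaned = strip_comments(sample)
print(cleaned)
```

Comment-like text inside the string literal is matched by the quote branch of group 2, so it is written back untouched, while the two real comments are dropped.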
----------------------------------------------------------
This is a modified version of the above that adds your { .. } comments.
(This is not recommended, because {} delimits a scope in C.)
Raw:
(?m)((?:(?:^[ \t]*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//|\{)))?|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|/\*|//|\{))|(?=\r?\n))|\{[\S\s]*?\}(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//|\{)))?))+)|((?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?:\r?\n(?:(?=(?:^[ \t]*)?(?:/\*|//|\{))|[^/"'\\\r\n{]*))+|[^/"'\\\r\n{]+)+|[\S\s][^/"'\\\r\n{]*)
Delimited /regex/:
/(?m)((?:(?:^[ \t]*)?(?:\/\*[^*]*\*+(?:[^\/*][^*]*\*+)*\/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/|\{)))?|\/\/(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/|\{))|(?=\r?\n))|\{[\S\s]*?\}(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/|\{)))?))+)|((?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?:\r?\n(?:(?=(?:^[ \t]*)?(?:\/\*|\/\/|\{))|[^\/"'\\\r\n{]*))+|[^\/"'\\\r\n{]+)+|[\S\s][^\/"'\\\r\n{]*)/
PCRE demo (using the sample text): https://regex101.com/r/xHTua7/1
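To see the modified pattern at work on the input from the question, a small Python sketch along the same lines may help. The names `ALL_COMMENTS`, `cleaned`, and `code_lines` are assumptions for this example; only the pattern itself comes from the answer.

```python
import re

# Raw modified pattern copied from the answer; it also treats { .. } as a comment.
ALL_COMMENTS = re.compile(
    r'''(?m)((?:(?:^[ \t]*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//|\{)))?|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|/\*|//|\{))|(?=\r?\n))|\{[\S\s]*?\}(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//|\{)))?))+)|((?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?:\r?\n(?:(?=(?:^[ \t]*)?(?:/\*|//|\{))|[^/"'\\\r\n{]*))+|[^/"'\\\r\n{]+)+|[\S\s][^/"'\\\r\n{]*)'''
)

# The input from the question.
source = (
    "int j=100;\n"
    "/* comment needs to be removed*/\n"
    "int c = 200;\n"
    "/*\n"
    "*comment needs to be removed\n"
    "*/\n"
    "count = count + 1;\n"
    "{comment needs to be removed}\n"
    "i++;\n"
)

# Replace every match with group 2 (empty when a comment matched instead).
cleaned = ALL_COMMENTS.sub(lambda m: m.group(2) or "", source)

# Blank lines may remain where a comment sat between two code lines,
# so this sketch filters them out to list only the surviving statements.
code_lines = [line for line in cleaned.splitlines() if line.strip()]
print(code_lines)
```

Keep the warning above in mind: with this pattern any real `{ ... }` block in C source, such as a function body, is matched as a comment and removed as well.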
Readable version of the modified regex, with comments:
(?m) # Multi-line modifier
( # (1 start), Comments
(?:
(?: ^ [ \t]* )? # <- To preserve formatting
(?:
/\* # Start /* .. */ comment
[^*]* \*+
(?: [^/*] [^*]* \*+ )*
/ # End /* .. */ comment
(?: # <- To preserve formatting
[ \t]* \r? \n
(?=
[ \t]*
(?:
\r? \n
| /\*
| //
| \{ # Added: for {} comments
)
)
)?
| # or,
// # Start // comment
(?: # Possible line-continuation
[^\\]
| \\
(?: \r? \n )?
)*?
(?: # End // comment
\r? \n
(?= # <- To preserve formatting
[ \t]*
(?:
\r? \n
| /\*
| //
| \{ # Added: for {} comments
)
)
| (?= \r? \n )
)
| # or,
\{ # Added: Start { .. } comment
[\S\s]*?
\} # Added: End { .. } comment
(?: # <- To preserve formatting
[ \t]* \r? \n
(?=
[ \t]*
(?:
\r? \n
| /\*
| //
| \{ # Added: for {} comments
)
)
)?
)
)+ # Grab multiple comment blocks if need be
) # (1 end)
| ## OR
( # (2 start), Non - comments
# Quotes
# ======================
(?: # Quote and Non-Comment blocks
"
[^"\\]* # Double quoted text
(?: \\ [\S\s] [^"\\]* )*
"
| # --------------
'
[^'\\]* # Single quoted text
(?: \\ [\S\s] [^'\\]* )*
'
| # --------------
(?: # Qualified linebreaks
\r? \n
(?:
(?= # If comment ahead just stop
(?: ^ [ \t]* )?
(?:
/\*
| //
| \{ # Added: for {} comments
)
)
| # or,
# Added: [^{] for {} comments
[^/"'\\\r\n{]* # Chars which don't start a comment, string, escape,
# or line continuation (escape + newline)
)
)+
| # --------------
# Added: [^{] for {} comments
[^/"'\\\r\n{]+ # Chars which don't start a comment, string, escape,
# or line continuation (escape + newline)
)+ # Grab multiple instances
| # or,
# ======================
# Pass through
[\S\s] # Any other char
# Added: [^{] for {} comments
[^/"'\\\r\n{]* # Chars which don't start a comment, string, escape,
# or line continuation (escape + newline)
) # (2 end), Non - comments