Refining a string using Python / regex

Question

Please help me clean up this string using Python / regex. It also contains large runs of whitespace.

/**
         * this is comment                this is comment
         * this is comment
         * <blank line>
         *      this is comment
         * this is comment
         * <blank line>
         * this is comment
         */

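How can I get the plain text by removing the /** and * markers?

I expect the output string to be:

this is comment this is comment this is comment this is comment this is comment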

Answer 1

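You can use the sub() function from the re module to match the unwanted characters and format the input string. Here is a proof of concept that gives the output you want. You can test it here: https://repl.it/@glhr/regex-fun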

import re

inputStr = """/**
         * this is comment                this is comment
         * this is comment
         * 
         *      this is comment
         * this is comment
         * 
         * this is comment
         */"""

formattedStr = re.sub(r"[*/]", "", inputStr)  # remove the comment markers * and /
formattedStr = re.sub(r"\n\s{2,}|\s{2,}", "\n", formattedStr)  # collapse each run of whitespace into one newline
formattedStr = re.sub(r"^\n+|\n+$|\n{2,}", "", formattedStr)  # strip leftover leading/trailing newlines
print(formattedStr)

You can experiment with regular expressions on sites such as https://regexr.com/.
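The three substitutions above could also be wrapped in a small helper. This is only a sketch: the function name clean_comment and the single_line option are made up for illustration and are not part of the answer. Note that the first step deletes every * and / in the text, so it would also damage comment text that contains those characters (a URL, for example).

```python
import re

def clean_comment(text, single_line=False):
    """Hypothetical helper wrapping the three re.sub() steps from the answer."""
    text = re.sub(r"[*/]", "", text)                # remove the comment markers
    text = re.sub(r"\n\s{2,}|\s{2,}", "\n", text)   # whitespace runs become one newline
    text = re.sub(r"^\n+|\n+$|\n{2,}", "", text)    # trim leading/trailing newlines
    # Optionally join the pieces with spaces to get one line of text.
    return " ".join(text.split("\n")) if single_line else text

sample = "/**\n     * first line        same line\n     * second line\n     * \n     *      indented line\n     */"

print(clean_comment(sample))
print(clean_comment(sample, single_line=True))
```

With single_line=True the pieces are joined by single spaces, which is closer to the one-line output shown in the question.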

Answer 2

Now that it is clear the OP expects the text this is comment six times, I suggest using this regex:

^[ /*]+\n?| {2,}(.*(\n))

and replacing each match with \2\1.
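To make the two alternatives easier to read, the same pattern can be written in verbose form. This is a sketch, not the answerer's code: the names PATTERN and sample are made up, and the literal space before {2,} is written as [ ] because re.VERBOSE ignores bare spaces outside character classes. It also assumes Python 3.5 or later, where a group that did not take part in the match is replaced by an empty string.

```python
import re

# Hypothetical verbose spelling of the regex above, with multiline mode on.
PATTERN = re.compile(r"""
    ^[ /*]+\n?     # at line start: spaces, slashes and stars, plus the newline if nothing else follows
  |                # or
    [ ]{2,}        # a run of two or more spaces inside a line
    (.*(\n))       # group 1: rest of the line with its newline; group 2: that newline
""", re.MULTILINE | re.VERBOSE)

sample = "/**\n * one   two\n * \n * three\n */"

# \2\1 puts a newline before the text that followed the gap, so the
# second comment on a line moves onto a line of its own.
result = PATTERN.sub(r"\2\1", sample)
print(result)
```

When the first alternative matches, both groups are empty and the match is simply deleted; when the second matches, the gap of spaces is replaced by a line break.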


Also, you don't really need three separate regexes (as in the other, accepted answer) to achieve this; a single regex is enough.

Here is a Python code demo:

import re

s = '''/**
         * this is comment                this is comment
         * this is comment
         * 
         *      this is comment
         * this is comment
         * 
         * this is comment
         */'''

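# (?m) turns on multiline mode so that ^ matches at the start of every line.
# The replacement \2\1 moves text that follows a long gap onto its own line.
print(re.sub(r'(?m)^[ /*]+\n?| {2,}(.*(\n))', r'\2\1', s))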

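This prints the following. Note that I enabled multiline mode by putting (?m) at the start of the regex, as suggested by FailSafe; many thanks to him for suggesting it, since it was not otherwise obvious: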

this is comment
this is comment
this is comment
this is comment
this is comment
this is comment
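For anyone wondering what (?m) changes, here is a minimal sketch on a made-up three-line string (the variable names are only for illustration). Without the flag, ^ matches only at the very start of the string; with it, ^ matches at the start of every line. Passing flags=re.MULTILINE is an equivalent way to enable it.

```python
import re

text = "  * first\n  * second\n  * third"

# Without multiline mode only the first line's prefix is removed.
without_flag = re.sub(r"^[ *]+", "", text)

# With (?m), or equivalently flags=re.MULTILINE, every line's prefix is removed.
with_inline_flag = re.sub(r"(?m)^[ *]+", "", text)
with_flag_arg = re.sub(r"^[ *]+", "", text, flags=re.MULTILINE)

print(repr(without_flag))
print(repr(with_inline_flag))
print(with_inline_flag == with_flag_arg)
```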

Let me know if you need an explanation of any part of my answer.
