import re
import fileinput
#regex used
#result = re.split('(?<=\S)[^-][ ](?=[a-zA-Z0-9])', line)
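<---- This worked on multiple lines, but it dropped a character on many of them and was not quite right, so after a lot of searching I had to add the "$" as shown below: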
result = re.split('(?<=\S$)[^-][ ](?=[a-zA-Z0-9])', line)
line = "WordsAreStickedTogetherHereIneedOneSpaceBetweeeThem"
result = re.split('(?<=\S$)[^-][ ](?=[a-zA-Z0-9])', line)
final_result = re.sub('dM','d M',result)
final_result = re.sub('dJ','d J',result)
for elem in final_result:
    print elem
ERROR:
$python main.py
Traceback (most recent call last):
  File "main.py", line 22, in <module>
    final_result = re.sub('dC','d C',result)
  File "/usr/lib64/python2.7/re.py", line 155, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or buffer
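The TypeError in the traceback comes from the type of `result`: `re.split` always returns a list, while `re.sub` expects a single string as its third argument. A minimal sketch of one way around it, assuming the goal is to run the substitution on each piece; the `\s+` split pattern and the `dT` pair are illustrative choices, not taken from the question:

```python
import re

line = "WordsAreStickedTogetherHereIneedOneSpaceBetweeeThem"

# re.split returns a list, even when nothing matches and the list has one item.
parts = re.split(r"\s+", line)

# Passing the list itself to re.sub raises TypeError, so substitute per element.
# "dT" -> "d T" is a hypothetical pair chosen because it occurs in this line.
fixed_parts = [re.sub(r"dT", "d T", part) for part in parts]

for part in fixed_parts:
    print(part)
```

Note that a second `re.sub` call should take the output of the first one as its input; otherwise the earlier substitution is discarded.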
If you only need to split the words (where a word is an uppercase letter followed by lowercase letters), you can simply use re.finditer:
line = "WordsAreStickedTogetherHereINeedOneSpaceBetweeeThem"
matches = re.finditer("[A-Z][a-z]*", line)
new_line = " ".join(match.group() for match in matches)
The variable new_line contains:
>>> print(new_line)
Words Are Sticked Together Here I Need One Space Betweee Them
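One caveat: the pattern `[A-Z][a-z]*` silently drops anything that is not a letter, such as digits or punctuation. If those characters need to survive, a possible alternative is to insert a space before every uppercase letter with `re.sub` and a lookahead. This is only a sketch; the helper name `split_camel_case` is made up for illustration:

```python
import re


def split_camel_case(text):
    """Insert a space before each uppercase letter except at the start."""
    # (?<!^)    : do not match at the very beginning of the string
    # (?=[A-Z]) : match the empty position just before an uppercase letter
    return re.sub(r"(?<!^)(?=[A-Z])", " ", text)


line = "WordsAreStickedTogetherHereINeedOneSpaceBetweeeThem"
print(split_camel_case(line))

# Digits are kept here, whereas the finditer pattern would skip the "2".
print(split_camel_case("Version2Release"))
```

Because nothing is consumed by the match, the original characters are all preserved and only spaces are added.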