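I'm trying to write code that splits a sentence into words and drops the punctuation. For example, if the user enters "Hello, how are you?", I can split the sentence into ['hello','how','are','you'].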
userinput = str(raw_input("Enter your sentence: "))
def sentence_split(sentence):
    result = []
    current_word = ""
    for letter in sentence:
        if letter.isalnum():
            current_word += letter
        else: ## this is a symbol or punctuation, e.g. reach end of a word
            if current_word:
                result.append(current_word)
                current_word = "" ## reinitialise for creating a new word
    return result
print "Split of your sentence:", sentence_split(userinput)
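So far my code works, but if I enter a sentence that does not end with punctuation, the last word is missing from the result. For example, if the input is "Hello, how are you", the result is ['hello','how','are']. I think this is because there is no punctuation mark to tell the code that the string has ended. Is there a way to make the program detect the end of the string, so that even if the input is "Hello, how are you", the result is still ['hello','how','are','you']?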
I haven't tried adapting your algorithm myself, but I think the approach below should do what you want.
def sentence_split(sentence):
    new_sentence = sentence[:]
    for letter in sentence:
        if not letter.isalnum():
            new_sentence = new_sentence.replace(letter, ' ')
    return new_sentence.split()
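For comparison, the original character-by-character loop can probably be kept with one small change. The loop only appends a word when it meets a non-alphanumeric character, so a word that runs to the end of the string is never appended; checking current_word once more after the loop should cover that case. This is only a sketch: the name split_words is made up here to keep it apart from sentence_split, and the print call is written so it runs on both Python 2 and Python 3.

```python
def split_words(sentence):
    """Split a sentence into alphanumeric words, ignoring punctuation."""
    result = []
    current_word = ""
    for letter in sentence:
        if letter.isalnum():
            current_word += letter
        elif current_word:
            # A non-alphanumeric character ends the current word.
            result.append(current_word)
            current_word = ""
    # The loop only flushes on a separator, so flush once more here
    # in case the sentence ends with a letter or digit.
    if current_word:
        result.append(current_word)
    return result


print(split_words("Hello, how are you"))  # ['Hello', 'how', 'are', 'you']
```

Note that this keeps the original capitalisation; if lowercase words are wanted, calling sentence.lower() before the loop would be one way to get them.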
Running sentence_split now:
runfile(r'C:\用户