How can I split text into sentences accurately in Python?

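How can I split text into sentences accurately in Python? I tried nltk, but it fails on some sentences: it does not correctly split sentences that contain parentheses and quotations.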

import nltk.data

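# Load the pre-trained Punkt sentence tokenizer for English
# (the 'punkt' resource must already be downloaded).
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')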
paragraph_text = 'Fans often re-watch films and may misidentify actors, so it is essential to pay close attention to details to avoid confusion! In addition to her other notable works, Raquel Welch starred in films such as Fathom (1967), Bandolero! (1968), 100 Rifles (1969), and Myra Breckinridge (1970).'
sentences = tokenizer.tokenize(paragraph_text)
print(sentences)

Output of my code:

['Fans often re-watch films and may misidentify actors, so it is essential to pay close attention to details to avoid confusion!', 'In addition to her other notable works, Raquel Welch starred in films such as Fathom (1967), Bandolero!', '(1968), 100 Rifles (1969), and Myra Breckinridge (1970).']

Desired output:

['Fans often re-watch films and may misidentify actors, so it is essential to pay close attention to details to avoid confusion!', 'In addition to her other notable works, Raquel Welch starred in films such as Fathom (1967), Bandolero! (1968), 100 Rifles (1969), and Myra Breckinridge (1970).']
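
One possible workaround, offered only as a sketch and not as a feature of nltk, is to post-process the tokenizer output and re-join any fragment that starts with an opening parenthesis onto the sentence before it. The helper name `merge_parenthetical_fragments` is hypothetical, and the rule assumes that no real sentence in the text begins with "(". The input list is the tokenizer output quoted above, so the example runs without nltk.

```python
def merge_parenthetical_fragments(sentences):
    """Re-join fragments that a sentence tokenizer split off before a '('.

    Hypothetical helper: assumes a fragment starting with '(' always
    continues the previous sentence.
    """
    merged = []
    for sentence in sentences:
        fragment = sentence.lstrip()
        if merged and fragment.startswith("("):
            # Treat this fragment as a continuation, e.g. 'Bandolero!'
            # followed by '(1968), ...', and glue it back with one space.
            merged[-1] = merged[-1] + " " + fragment
        else:
            merged.append(sentence)
    return merged


# Tokenizer output copied from the question.
raw = [
    'Fans often re-watch films and may misidentify actors, so it is essential to pay close attention to details to avoid confusion!',
    'In addition to her other notable works, Raquel Welch starred in films such as Fathom (1967), Bandolero!',
    '(1968), 100 Rifles (1969), and Myra Breckinridge (1970).',
]
fixed = merge_parenthetical_fragments(raw)
print(fixed)
```

The heuristic is deliberately narrow: it repairs only the split shown in this question, and it would wrongly merge a sentence that legitimately opens with a parenthesis. It also joins the pieces with a single space, whatever whitespace the original text had at that point.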
python machine-learning nlp nltk