How to set sentences as variables in NLTK

Question

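I'm very new to NLTK and have gotten stuck. I want to split a text file into individual sentences and have each sentence set as a variable for later use. I've already taken care of the first part: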

import nltk
from nltk.tokenize import sent_tokenize

text1 = open('/Users/joshuablew/Documents/myCorpus/version1.txt').read()

sent_tokenize(text1)

This returns each sentence as a separate item:

['Who was the 44th president of the United States?', 'Where does he live?', 'This is just a plain sentence.', 'As well as this one, just to break up the questions.', 'How many houses make up the United States Congress?', 'What are they called?', 'Again, another question breakpoint here.', 'Who is our current President?', 'Can he run for re-election?', 'Why or why not?']

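From here, I don't know how to have these sentences automatically saved to variables.

Or rather, is it possible to index them, so that text1[0] = 'Who was the 44th president of the United States?' and text1[1] = 'Where does he live?', and so on? That is, each index into the text would be one individual sentence.

Thank you for your help.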

Answer 1

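import nltk
from nltk.tokenize import sent_tokenize

# Read the whole file into a single string
with open('1.txt', 'r') as myfile:
    text = myfile.read()

# sent_tokenize returns a list of sentence strings
# (it relies on NLTK's Punkt sentence tokenizer data being installed)
textList = sent_tokenize(text)

# Number of sentences found
print(len(textList))

# Each sentence is available by index, e.g. textList[0], textList[1]
print(textList)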
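
On the indexing part of the question: sent_tokenize returns an ordinary Python list, so once the result is assigned to a name, each sentence can be reached by position. Here is a minimal sketch of that idea. To keep it runnable without NLTK or the original file, the first three sentences from the question's output are hard-coded in place of the sent_tokenize call. The names `sentences`, `first`, `second`, and `named`, and the "sentence_N" key format, are illustrative assumptions, not anything from NLTK.

```python
# Stand-in for: sentences = sent_tokenize(text1)
# (hard-coded from the question's output so the sketch runs without NLTK)
sentences = [
    'Who was the 44th president of the United States?',
    'Where does he live?',
    'This is just a plain sentence.',
]

# Indexing the list gives one sentence at a time
first = sentences[0]
second = sentences[1]

# If named lookups are preferred over positions, a dict can be built.
# The "sentence_N" key format is an arbitrary choice for illustration.
named = {f"sentence_{i}": s for i, s in enumerate(sentences)}

print(first)
print(named["sentence_1"])
```

A list or dict like this is usually easier to work with than creating a separate variable per sentence, since the number of sentences is not known in advance.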
