How can I find the nouns in a sentence based on context? I am using the nltk library as follows:
import nltk

text = 'I bought a vintage car.'
tokens = nltk.word_tokenize(text)
tagged = nltk.pos_tag(tokens)
# Keep only the tokens tagged 'NN' (singular common noun)
result = [pair for pair in tagged if pair[1] == 'NN']
# result = [('vintage', 'NN'), ('car', 'NN')]
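The problem with this script is that it treats vintage as a noun. That can be a valid reading of the word on its own, but in this context it is an adjective.
How can I extract only the words that function as nouns in context?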
Addendum: with textblob, we get "vintage car" as a noun phrase:
!python -m textblob.download_corpora
from textblob import TextBlob
txt = "I bought a vintage car."
blob = TextBlob(txt)
print(blob.noun_phrases) #['vintage car']
spaCy may solve this for you. Try the following:
import spacy
nlp = spacy.load("en_core_web_lg")
def analyze(text):
    doc = nlp(text)
    for token in doc:
        print(token.text, token.pos_)
analyze("I bought a vintage car.")
print()
analyze("This old wine is a vintage.")
Output:
I PRON
bought VERB
a DET
vintage ADJ
car NOUN
. PUNCT
This DET
old ADJ
wine NOUN
is AUX
a DET
vintage NOUN
. PUNCT
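
If you want the nouns as a list rather than printed tags, one option is to filter on the coarse tag. The following is only a minimal sketch: extract_nouns and fake_doc are hypothetical helper names, and the stand-in tokens simply copy the tags from the output above so that the example runs without a spaCy model installed. Passing nlp(text) in place of the stand-ins should behave the same way, since iterating over a spaCy Doc yields tokens that expose .text and .pos_.

```python
from types import SimpleNamespace


def extract_nouns(tokens, noun_tags=("NOUN",)):
    """Return the text of every token whose coarse POS tag is in noun_tags."""
    return [token.text for token in tokens if token.pos_ in noun_tags]


def fake_doc(pairs):
    """Build stand-in tokens (assumption: they mimic only .text and .pos_)."""
    return [SimpleNamespace(text=text, pos_=pos) for text, pos in pairs]


# Tags copied from the output above, not produced by running a model here.
car_sentence = fake_doc([
    ("I", "PRON"), ("bought", "VERB"), ("a", "DET"),
    ("vintage", "ADJ"), ("car", "NOUN"), (".", "PUNCT"),
])
wine_sentence = fake_doc([
    ("This", "DET"), ("old", "ADJ"), ("wine", "NOUN"), ("is", "AUX"),
    ("a", "DET"), ("vintage", "NOUN"), (".", "PUNCT"),
])

print(extract_nouns(car_sentence))   # ['car']
print(extract_nouns(wine_sentence))  # ['wine', 'vintage']
```

If proper nouns should count as well, pass noun_tags=("NOUN", "PROPN"), since spaCy tags proper nouns separately as PROPN.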