Can I use spaCy in Python to find NPs with specific neighbors? I want the noun phrases in my text to have a verb both before and after them.
>>> import spacy
>>> nlp = spacy.load('en')
>>> sent = u'run python program run, to make this work'
>>> parsed = nlp(sent)
>>> list(parsed.noun_chunks)
[python program]
>>> for noun_phrase in list(parsed.noun_chunks):
... noun_phrase.merge(noun_phrase.root.tag_, noun_phrase.root.lemma_, noun_phrase.root.ent_type_)
...
python program
>>> [(token.text,token.pos_) for token in parsed]
[(u'run', u'VERB'), (u'python program', u'NOUN'), (u'run', u'VERB'), (u',', u'PUNCT'), (u'to', u'PART'), (u'make', u'VERB'), (u'this', u'DET'), (u'work', u'NOUN')]
From https://spacy.io/usage/linguistic-features#dependency-parse

You can use noun chunks. Noun chunks are "base noun phrases" – flat phrases that have a noun as their head. You can think of noun chunks as a noun plus the words describing the noun – for example, "the lavish green grass" or "the world's largest tech fund". To get the noun chunks in a document, simply iterate over Doc.noun_chunks.
In:
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp(u"Autonomous cars shift insurance liability toward manufacturers")
for chunk in doc.noun_chunks:
    print(chunk.text)
Out:
Autonomous cars
insurance liability
manufacturers