Named entity recognition in aspect extraction using dependency rule matching

Question

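Using spaCy, I extract aspect-opinion pairs from text based on grammar rules that I defined. The rules are built on POS tags and dependency tags, which are obtained from token.pos_ and token.dep_. Below is an example of one of the grammar rules. If I pass the sentence Japan is cool, it returns [('Japan', 'cool', 0.3182)], where the value is the polarity of cool.

However, I don't know how to make it recognize named entities. For example, if I pass Air France is cool, I want to get [('Air France', 'cool', 0.3182)], but what I currently get is [('France', 'cool', 0.3182)].

I have checked the spaCy online documentation and I know how to extract named entities (doc.ents). What I would like to know is a possible workaround to make my extractor work. Please note that I don't want a forced measure such as concatenating the strings into AirFrance, Air_France, etc.

Thank you!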

import spacy
# Assumption: `sid` was not defined in the original post. NLTK's VADER analyzer
# is assumed here (it needs nltk.download("vader_lexicon") once).
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nlp = spacy.load("en_core_web_lg")  # the post used version 2.2.5 of this model
sid = SentimentIntensityAnalyzer()

review_body = "Air France is cool."
doc = nlp(review_body)

rule3_pairs = []

for token in doc:
    A = None            # aspect: the nominal subject
    M = None            # opinion: the adjectival complement
    neg_prefix = None   # set when a modal or a negation is found

    for child in token.children:
        if child.dep_ == "nsubj" and not child.is_stop:  # nsubj is nominal subject
            A = child.text

        if child.dep_ == "acomp" and not child.is_stop:  # acomp is adjectival complement
            M = child.text

        # example - 'this could have been better' -> (this, not better)
        if child.dep_ == "aux" and child.tag_ == "MD":  # MD is modal auxiliary
            neg_prefix = "not"

        if child.dep_ == "neg":  # neg is negation
            neg_prefix = child.text

    if neg_prefix is not None and M is not None:
        M = neg_prefix + " " + M

    if A is not None and M is not None:
        rule3_pairs.append((A, M, sid.polarity_scores(M)['compound']))
Result

rule3_pairs
>>> [('France', 'cool', 0.3182)]
Desired output

rule3_pairs
>>> [('Air France', 'cool', 0.3182)]


Answer 1

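Integrating entities into your extractor is quite easy. For each pair of children, check whether the "A" child is the head of some named entity, and if it is, use the whole entity as the subject.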
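One way the check described above might look is sketched below. The helper names entity_headed_by and aspect_text are hypothetical and do not come from the original post. The sketch assumes the token comes from a parsed spaCy Doc with entities set, and it relies only on Token.i, Token.doc, Doc.ents, Span.root and Span.text.

```python
def entity_headed_by(token):
    """Return the named-entity span whose syntactic root is `token`, else None.

    `token` is expected to be a spaCy Token taken from a parsed Doc.
    """
    for ent in token.doc.ents:
        # Span.root is the token of the span that sits highest in the
        # dependency tree, e.g. "France" for the entity "Air France".
        if ent.root.i == token.i:
            return ent
    return None


def aspect_text(token):
    """Use the full entity text when `token` heads an entity, else the token text."""
    ent = entity_headed_by(token)
    return ent.text if ent is not None else token.text
```

In the loop from the question, the nsubj branch would then assign A = aspect_text(child) instead of A = child.text. If the model tags Air France as an entity, the pair should come out with 'Air France' as the aspect, and tokens that head no entity are left unchanged.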
