How to retrieve subtrees when parsing in NLP

Question

I want to retrieve the subtrees when parsing a sentence, as shown below:

from nltk.parse import stanford

sentence = "All new medications must undergo testing before they can be prescribed"
parser = stanford.StanfordParser()
# raw_parse returns an iterator in recent NLTK versions, so materialize it before indexing
tree_parse = list(parser.raw_parse(sentence))
for sub_tree in tree_parse[0].subtrees():
    # keep only the clause-level subtrees
    if sub_tree.label() == "S":
        print(sub_tree)

What I expect is to access the subtrees labeled "S" individually, as shown below:

first subtree

(S
  (NP (DT All) (JJ new) (NNS medications))
  (VP
    (MD must)
    (VP
      (VB undergo)

second subtree

(S
    (VP
      (VBG testing)
      (SBAR
        (IN before)

third subtree

(S
          (NP (PRP they))
          (VP (MD can) (VP (VB be) (VP (VBN prescribed)))))))))))

But the actual output is as follows:

 (NP (DT All) (JJ new) (NNS medications))
  (VP
  (MD must)
  (VP
    (VB undergo)
    (S
      (VP
        (VBG testing)
        (SBAR
          (IN before)
          (S
            (NP (PRP they))
            (VP (MD can) (VP (VB be) (VP (VBN prescribed))))))))))
How can I access the subtrees individually, like accessing items in a list?
python parsing nlp nltk stanford-nlp
1 Answer

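You are already getting the subtrees: a subtree contains everything under its root node, so the output you show is correctly retrieved as the "subtree" under the top-level S. Your loop will then print the subtree dominating "testing before they can be prescribed", and finally the lowest S, dominating "they can be prescribed".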
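
The nesting can be seen without running the Stanford parser. The sketch below rebuilds the parse from the question with nltk's Tree.fromstring (the bracketed string is reconstructed from the output in the question, with the top-level "(S" restored, so treat it as an assumption) and prints only the words each S dominates:

```python
from nltk.tree import Tree

# Parse reconstructed from the question's output; assumed, not re-run through the parser.
PARSE = """
(S
  (NP (DT All) (JJ new) (NNS medications))
  (VP
    (MD must)
    (VP
      (VB undergo)
      (S
        (VP
          (VBG testing)
          (SBAR
            (IN before)
            (S
              (NP (PRP they))
              (VP (MD can) (VP (VB be) (VP (VBN prescribed)))))))))))
"""

tree = Tree.fromstring(PARSE)

# subtrees() walks the tree top-down, so the outermost S comes first.
phrases = []
for sub_tree in tree.subtrees():
    if sub_tree.label() == "S":
        # leaves() returns every word under this node, nested clauses included.
        phrases.append(" ".join(sub_tree.leaves()))

for phrase in phrases:
    print(phrase)
# All new medications must undergo testing before they can be prescribed
# testing before they can be prescribed
# they can be prescribed
```

Each S is a separate subtree, but the outer ones include the inner ones, which is why the first printed tree looks like the whole sentence.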

Incidentally, you can get only the S subtrees directly by passing a filter to subtrees():

for sub_tree in tree_parse[0].subtrees(lambda t: t.label() == "S"):
    print(sub_tree)
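
To access the matches like items in a list, the generator can be wrapped in list(). If what you want is each S without the clauses nested inside it (as in the expected output in the question), one possible approach is to copy each subtree while skipping nested S nodes. The helper without_nested below is hypothetical, not part of nltk, and the parse string is reconstructed from the question rather than produced by the parser:

```python
from nltk.tree import Tree

# Parse reconstructed from the question's output (assumed, with the top-level S restored).
PARSE = (
    "(S (NP (DT All) (JJ new) (NNS medications)) (VP (MD must) (VP (VB undergo) "
    "(S (VP (VBG testing) (SBAR (IN before) "
    "(S (NP (PRP they)) (VP (MD can) (VP (VB be) (VP (VBN prescribed)))))))))))"
)


def without_nested(tree, label="S"):
    """Hypothetical helper: copy `tree`, dropping descendants labeled `label`."""
    children = []
    for child in tree:
        if isinstance(child, Tree):
            if child.label() == label:
                continue  # skip the nested clause entirely
            children.append(without_nested(child, label))
        else:
            children.append(child)  # a leaf (word) is kept as-is
    return Tree(tree.label(), children)


tree = Tree.fromstring(PARSE)

# Materialize the generator so the subtrees can be indexed.
s_subtrees = list(tree.subtrees(lambda t: t.label() == "S"))
print(len(s_subtrees))             # 3
print(s_subtrees[2].leaves())      # words of the innermost clause

# Each clause on its own, with nested S nodes removed.
pruned = [without_nested(t) for t in s_subtrees]
own_words = [" ".join(t.leaves()) for t in pruned]
for words in own_words:
    print(words)
```

Note that the pruned trees are copies; the original subtrees in s_subtrees are left unchanged.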