Determining the temporality of a sentence with POS tagging

Question (votes: 0, answers: 1)

I want to find out, from a range of sentences, whether an action has already been carried out. For example: "I will prescribe this medication", "I prescribed this medication", "He had already taken the stuff", "he may take the stuff later".

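I am trying the tidytext approach and decided to simply look for past participle versus future participle verbs. However, when I POS-tag the text, the only verb tags I get are "Verb intransitive", "Verb (usu participle)" and "Verb (transitive)". How can I find out whether a verb is past or future, or is there another POS tagger I can use?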

I am keen to use tidytext because I cannot install rJava, which some other text mining packages depend on.

r text-mining tidytext
1 answer
1
vote

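Look at the morphological features in the udpipe annotation. They are stored in the feats column of the annotation. You can use cbind_morphological to add them to the dataset as extra columns. All the features are defined at https://universaldependencies.org/u/feat/index.html. You will see below that the word "prescribed" from the sentence "I prescribed this medication" is in the past tense, and so is the word "taken" from "He had already taken".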

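library(udpipe)

# Four example sentences, one per document
x <- data.frame(doc_id = 1:4, 
                text = c("I will prescribe this medication", 
                         "I prescribed this medication", 
                         "He had already taken the stuff", 
                         "he may take the stuff later"), 
                stringsAsFactors = FALSE)
# Annotate with the English model (downloaded if it is not already available locally)
anno <- udpipe(x, "english")
# Split the feats column into one morph_* column per feature
anno <- cbind_morphological(anno)

# Show the tense-related columns for each token
anno[, c("doc_id", "token", "lemma", "feats", "morph_verbform", "morph_tense")]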

 doc_id      token      lemma                                                  feats morph_verbform morph_tense
      1          I          I             Case=Nom|Number=Sing|Person=1|PronType=Prs           <NA>        <NA>
      1       will       will                                           VerbForm=Fin            Fin        <NA>
      1  prescribe  prescribe                                           VerbForm=Inf            Inf        <NA>
      1       this       this                               Number=Sing|PronType=Dem           <NA>        <NA>
      1 medication medication                                            Number=Sing           <NA>        <NA>
      2          I          I             Case=Nom|Number=Sing|Person=1|PronType=Prs           <NA>        <NA>
      2 prescribed  prescribe                       Mood=Ind|Tense=Past|VerbForm=Fin            Fin        Past
      2       this       this                               Number=Sing|PronType=Dem           <NA>        <NA>
      2 medication medication                                            Number=Sing           <NA>        <NA>
      3         He         he Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs           <NA>        <NA>
      3        had       have                       Mood=Ind|Tense=Past|VerbForm=Fin            Fin        Past
      3    already    already                                                   <NA>           <NA>        <NA>
      3      taken       take                               Tense=Past|VerbForm=Part           Part        Past
      3        the        the                              Definite=Def|PronType=Art           <NA>        <NA>
      3      stuff      stuff                                            Number=Sing           <NA>        <NA>
      4         he         he Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs           <NA>        <NA>
      4        may        may                                           VerbForm=Fin            Fin        <NA>
      4       take       take                                           VerbForm=Inf            Inf        <NA>
      4        the        the                              Definite=Def|PronType=Art           <NA>        <NA>
      4      stuff      stuff                                            Number=Sing           <NA>        <NA>
      4      later      later                                                   <NA>           <NA>        <NA>
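
To get from the token-level output above to a per-sentence answer, something like the following could work. It is only a sketch under stated assumptions, not part of udpipe: the helper classify_sentence, the labels "done" / "not yet done" / "unknown", and the list of auxiliaries treated as "not yet done" are all hypothetical choices made for illustration. Note that in the annotation above, "will" and "may" carry no Tense value, so future or possible actions have to be detected through the lemma rather than through morph_tense. To keep the sketch runnable without downloading a model, the verb rows are copied by hand from the output above.

```r
# Verb-like tokens copied by hand from the annotation shown above,
# so the sketch runs without a udpipe model.
anno_verbs <- data.frame(
  doc_id = c(1, 1, 2, 3, 3, 4, 4),
  token = c("will", "prescribe", "prescribed", "had", "taken", "may", "take"),
  lemma = c("will", "prescribe", "prescribe", "have", "take", "may", "take"),
  morph_verbform = c("Fin", "Inf", "Fin", "Fin", "Part", "Fin", "Inf"),
  morph_tense = c(NA, NA, "Past", "Past", "Past", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label one sentence from its tokens' tense and lemma.
# Assumption: any past-tense token means the action was carried out;
# otherwise an auxiliary such as "will" or "may" means it was not (yet).
classify_sentence <- function(tense, lemma) {
  if (any(tense == "Past", na.rm = TRUE)) {
    "done"
  } else if (any(lemma %in% c("will", "shall", "may", "might"))) {
    "not yet done"
  } else {
    "unknown"
  }
}

# Apply the helper to each document; the result is named by doc_id.
status <- sapply(
  split(anno_verbs, anno_verbs$doc_id),
  function(d) classify_sentence(d$morph_tense, d$lemma)
)
print(status)
```

With real data, the same helper could be applied to the anno data frame returned by cbind_morphological, since it only relies on the doc_id, lemma and morph_tense columns. Rules this simple will misjudge sentences such as negations ("I did not prescribe"), so treat the labels as a starting point.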