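I want to determine, from a series of sentences, whether an action has already been carried out. For example: "I will prescribe this medication" versus "I prescribed this medication", or "He had already taken the stuff" versus "he may take the stuff later".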
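I am trying a tidytext approach and decided to simply look for past participle versus future participle verbs. However, when I POS-tag with tidytext, the only verb tags I get are "Verb intransitive", "Verb (usu participle)" and "Verb (transitive)". How can I tell past verbs from future ones, or is there another POS tagger I can use?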
I am keen to use tidytext because I cannot install rjava, which some of the other text mining packages depend on.
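Have a look at the morphological features in the udpipe annotation. They are stored in the feats column of the annotation, and you can use cbind_morphological to add them to the dataset as extra columns. All the features are defined at https://universaldependencies.org/u/feat/index.html. In the output below you will see that "prescribed" in "I prescribed this medication" is tagged as past tense, as are "had" and "taken" in "He had already taken the stuff".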
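library(udpipe)

# One row per sentence; doc_id links each token back to its sentence
x <- data.frame(doc_id = 1:4,
                text = c("I will prescribe this medication",
                         "I prescribed this medication",
                         "He had already taken the stuff",
                         "he may take the stuff later"),
                stringsAsFactors = FALSE)

# Tokenise, tag and parse; the English model is downloaded if it is not available locally
anno <- udpipe(x, "english")

# Split the feats string into one morph_* column per feature
anno <- cbind_morphological(anno)
anno[, c("doc_id", "token", "lemma", "feats", "morph_verbform", "morph_tense")]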
doc_id  token       lemma       feats                                                   morph_verbform  morph_tense
1       I           I           Case=Nom|Number=Sing|Person=1|PronType=Prs              <NA>            <NA>
1       will        will        VerbForm=Fin                                            Fin             <NA>
1       prescribe   prescribe   VerbForm=Inf                                            Inf             <NA>
1       this        this        Number=Sing|PronType=Dem                                <NA>            <NA>
1       medication  medication  Number=Sing                                             <NA>            <NA>
2       I           I           Case=Nom|Number=Sing|Person=1|PronType=Prs              <NA>            <NA>
2       prescribed  prescribe   Mood=Ind|Tense=Past|VerbForm=Fin                        Fin             Past
2       this        this        Number=Sing|PronType=Dem                                <NA>            <NA>
2       medication  medication  Number=Sing                                             <NA>            <NA>
3       He          he          Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs  <NA>            <NA>
3       had         have        Mood=Ind|Tense=Past|VerbForm=Fin                        Fin             Past
3       already     already     <NA>                                                    <NA>            <NA>
3       taken       take        Tense=Past|VerbForm=Part                                Part            Past
3       the         the         Definite=Def|PronType=Art                               <NA>            <NA>
3       stuff       stuff       Number=Sing                                             <NA>            <NA>
4       he          he          Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs  <NA>            <NA>
4       may         may         VerbForm=Fin                                            Fin             <NA>
4       take        take        VerbForm=Inf                                            Inf             <NA>
4       the         the         Definite=Def|PronType=Art                               <NA>            <NA>
4       stuff       stuff       Number=Sing                                             <NA>            <NA>
4       later       later       <NA>                                                    <NA>            <NA>
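
To get from the token-level output to a per-sentence answer, one option is to flag a sentence as "done" when any of its tokens has morph_tense equal to "Past". The following is only a minimal sketch of that heuristic in base R. The names action_done and verbs are hypothetical (they are not part of udpipe), and the verbs data frame simply copies the verb rows from the output above so that the example runs without downloading a model.

```r
# Hypothetical helper (not part of udpipe): a sentence counts as "done"
# when at least one of its tokens carries Tense=Past.
action_done <- function(anno) {
  has_past <- tapply(anno$morph_tense, anno$doc_id,
                     function(tense) any(tense == "Past", na.rm = TRUE))
  data.frame(doc_id = names(has_past),
             done = as.vector(has_past),
             stringsAsFactors = FALSE)
}

# Verb rows copied by hand from the annotation output above
# (assumption: only doc_id and morph_tense are needed for the heuristic).
verbs <- data.frame(
  doc_id = c(1, 1, 2, 3, 3, 4, 4),
  token = c("will", "prescribe", "prescribed", "had", "taken", "may", "take"),
  morph_tense = c(NA, NA, "Past", "Past", "Past", NA, NA),
  stringsAsFactors = FALSE
)

result <- action_done(verbs)
print(result)
```

With real data you would pass the full anno data frame returned by cbind_morphological instead of verbs. Treat this as a rough first pass: a sentence such as "He said he will take it" contains a past-tense verb even though the action has not happened, so you may need to look at which verb carries the tense, or at auxiliaries such as "will" and "may", before trusting the flag.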