Problem analyzing Turkish text in R using stopwords with "tr"


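I am analyzing Turkish text in R, but I run into a problem when calling stopwords with "tr". According to the linked documentation, Turkish is denoted by "tr", yet the language is still not recognized.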

This is the error:

Error: Language "tr" not available in source "snowball". See `stopwords_getlanguages` for more info on supported languages.

Any help would be appreciated.

r text-mining
1 Answer

You're almost there. You only need to change the source that stopwords::stopwords pulls the language from.

tldr:

To make the code work, run:

stopwords::stopwords("tr", source = "stopwords-iso")
[1] "acaba"      "acep"       "adamakıllı" "adeta"      "ait"        "altmýþ"  ... 

Explanation:

These are the languages available in the default source, "snowball":

stopwords::stopwords_getlanguages(source = "snowball")
[1] "da" "de" "en" "es" "fi" "fr" "hu" "ir" "it" "nl" "no" "pt" "ro" "ru" "sv"

Turkish is not among them. To get it, change the source to source = "stopwords-iso". These are the languages available in that source:

stopwords::stopwords_getlanguages(source = "stopwords-iso")
 [1] "af" "ar" "hy" "eu" "bn" "br" "bg" "ca" "zh" "hr" "cs" "da" "nl" "en" "eo" "et" "fi" "fr" "gl" "de" "el" "ha" "he" "hi" "hu" "id" "ga"
[28] "it" "ja" "ko" "ku" "la" "lt" "lv" "ms" "mr" "no" "fa" "pl" "pt" "ro" "ru" "sk" "sl" "so" "st" "es" "sw" "sv" "th" "tl" "tr" "uk" "ur"
[55] "vi" "yo" "zu"
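
If it is not obvious which source carries a given language, the lookup can be automated. The following is a minimal sketch, assuming the stopwords package is installed. The helper name sources_with_language is hypothetical and not part of the package; it relies only on stopwords_getsources() and stopwords_getlanguages().

```r
library(stopwords)

# Hypothetical helper (not part of the stopwords package):
# return the names of all sources whose language list contains `lang`.
sources_with_language <- function(lang) {
  srcs <- stopwords_getsources()
  has_lang <- vapply(
    srcs,
    function(s) {
      # Treat a source that cannot report its languages as "not available".
      tryCatch(
        lang %in% stopwords_getlanguages(source = s),
        error = function(e) FALSE
      )
    },
    logical(1)
  )
  srcs[has_lang]
}

sources_with_language("tr")
```

Any source name returned here can be passed as the source argument of stopwords::stopwords().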

So, to get the Turkish stop words, run:

stopwords::stopwords("tr", source = "stopwords-iso")
[1] "acaba"      "acep"       "adamakıllı" "adeta"      "ait"        "altmýþ"  ... 
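
Once the list loads, it can be used to filter tokens. The following is a rough sketch in base R, assuming the stopwords package is installed. The example sentence and the variable names are made up for illustration, and the whitespace split stands in for whatever tokenizer the real analysis uses.

```r
library(stopwords)

# Turkish stop words from the stopwords-iso source, as in the answer above.
tr_stop <- stopwords("tr", source = "stopwords-iso")

# Hypothetical example sentence, already lower case so that no
# locale-sensitive case folding is needed.
text <- "acaba bu kitap ait olduğu rafa konuldu mu"

# Naive whitespace tokenization (an assumption made for brevity).
tokens <- strsplit(text, "\\s+")[[1]]

# Keep only the tokens that do not appear in the stop word list.
content_tokens <- tokens[!tokens %in% tr_stop]
content_tokens
```

Text-mining packages usually accept such a character vector directly in their own stop word removal step, so the manual filter is only needed when working in base R.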