Difference between similar() and concordance() in NLTK

Question

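I have read about text1.similar("monstrous") and text1.concordance("monstrous") from this.

I could not find a satisfactory explanation of the difference between text1.concordance('monstrous') and text1.similar('monstrous') in Python's Natural Language Toolkit.

Could you please explain it in detail with an example?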

Answer 1

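concordance(token) gives you the context surrounding the token argument. It shows the sentences in which token appears.

similar(token) returns a list of words that appear in the same contexts as token. Here, the context is just the words directly on either side of token.

So, take the text of Moby Dick (text1). We can check the concordance of 'monstrous':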

text1.concordance('monstrous')

# returns:
Displaying 11 of 11 matches:
ong the former , one was of a most monstrous size . ... This came towards us ,
ON OF THE PSALMS . " Touching that monstrous bulk of the whale or ork we have r
ll over with a heathenish array of monstrous clubs and spears . Some were thick
d as you gazed , and wondered what monstrous cannibal and savage could ever hav
that has survived the flood ; most monstrous and most mountainous ! That Himmal
they might scout at Moby Dick as a monstrous fable , or still worse and more de
th of Radney .'" CHAPTER 55 Of the Monstrous Pictures of Whales . I shall ere l
ing Scenes . In connexion with the monstrous pictures of whales , I am strongly
ere to enter upon those still more monstrous stories of them which are to be fo
ght have been rummaged out of this monstrous cabinet there is no telling . But
of Whale - Bones ; for Whales of a monstrous size are oftentimes cast up dead u
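
The idea behind this output can be approximated in a few lines of plain Python. This is only a minimal sketch, not NLTK's implementation: the function name toy_concordance and the token-based width parameter are assumptions made for illustration (NLTK trims its lines to a character width rather than a token count).

```python
def toy_concordance(tokens, word, width=3):
    """Return each occurrence of `word` with up to `width` tokens on either side.

    Hypothetical helper for illustration; matching is case-insensitive,
    which is why 'Monstrous' also shows up in the real output above.
    """
    target = word.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = tokens[max(0, i - width):i]      # words before the match
            right = tokens[i + 1:i + 1 + width]     # words after the match
            lines.append(" ".join(left + [tok] + right))
    return lines


tokens = ("one was of a most monstrous size . "
          "Whales of a monstrous size are cast up").split()

for line in toy_concordance(tokens, "monstrous"):
    print(line)
# of a most monstrous size . Whales
# Whales of a monstrous size are cast
```

Every occurrence is reported, each with its own window of neighbouring words, which is all a concordance is.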

We can then get a list of words that appear in contexts similar to those of 'monstrous'. The context of the first returned line is 'most _____ size'.

text1.similar('monstrous')

# returns:
determined maddens contemptible modifies abundant tyrannical puzzled
trustworthy impalpable gamesome curious mean pitiable untoward
christian subtly passing domineering uncommon true
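
To make "same context" concrete, here is a rough sketch that ranks words by how many (left neighbour, right neighbour) pairs they share with the target word. The name toy_similar and the toy sentence are assumptions for illustration; NLTK's own scoring and its handling of the first and last tokens may differ, so treat this as an approximation of the idea rather than the library's algorithm.

```python
from collections import Counter, defaultdict


def toy_similar(tokens, word, num=20):
    """Rank other words by the number of (left, right) contexts shared with `word`.

    Hypothetical helper for illustration; the first and last tokens are
    ignored because they lack a neighbour on one side.
    """
    toks = [t.lower() for t in tokens]
    target_word = word.lower()

    # Map every word to the set of (left neighbour, right neighbour) pairs it occurs in.
    contexts = defaultdict(set)
    for left, mid, right in zip(toks, toks[1:], toks[2:]):
        contexts[mid].add((left, right))

    target = contexts.get(target_word, set())
    scores = Counter()
    for other, ctxs in contexts.items():
        shared = len(ctxs & target)
        if other != target_word and shared:
            scores[other] = shared
    return [w for w, _ in scores.most_common(num)]


tokens = ("a most monstrous size , a most curious size , "
          "the monstrous pictures , the true pictures , "
          "the curious pictures").split()

print(toy_similar(tokens, "monstrous"))
# ['curious', 'true']
```

In the toy sentence, 'curious' shares two contexts with 'monstrous' (most _ size, the _ pictures) and 'true' shares one, so they are listed in that order.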

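If we take the word 'true' and check its concordance with text1.concordance('true'), we get back the first 25 of 87 uses of the word 'true'. That is not very useful, but NLTK does provide an additional method called common_contexts, which shows where a list of words share the same surrounding words.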

text1.common_contexts(['monstrous', 'true'])

# returns:
the_pictures

This result tells us that the phrases "the monstrous pictures" and "the true pictures" both appear in Moby Dick.
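
A shared context like the_pictures can be reproduced with a small sketch: collect the (left, right) neighbours of each word and intersect the sets. The function toy_common_contexts and the toy sentence are hypothetical and only mirror the left_right output format shown above; this is not NLTK's code.

```python
def toy_common_contexts(tokens, words):
    """Return the 'left_right' contexts in which every word in `words` occurs.

    Hypothetical helper for illustration only.
    """
    toks = [t.lower() for t in tokens]
    contexts = {w.lower(): set() for w in words}

    # Record the immediate neighbours of each word we care about.
    for left, mid, right in zip(toks, toks[1:], toks[2:]):
        if mid in contexts:
            contexts[mid].add((left, right))

    # Keep only the contexts that all of the words have in common.
    shared = set.intersection(*contexts.values()) if contexts else set()
    return sorted(f"{left}_{right}" for left, right in shared)


tokens = ("the monstrous pictures of whales and "
          "the true pictures of whales").split()

print(toy_common_contexts(tokens, ["monstrous", "true"]))
# ['the_pictures']
```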

Answer 2

I will explain with an example:

text1.similar("monstrous")

will output words that occur in a similar context, i.e. word1 ______ word2. For example, it outputs the word doleful. If you run:

text1.concordance("monstrous")

you will see this line among the matches:

that has survived the flood ; most monstrous and most mountainous ! That Himmal

If you run:

text1.concordance("doleful")

you will see this line among the matches:

ite perspectives . There ' s a most doleful and most mocking funeral ! The sea

And

text1.common_contexts(["monstrous", "doleful"])

will output the words that surround both monstrous and doleful, namely "most" and "and":

most_and

Answer 3

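According to the NLTK docs, a concordance view shows us every occurrence of a given word, together with some context. For example:

similar is used to find other words that appear in a similar range of contexts. For example: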

Answer 4

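concordance(token) gives you the contexts in which the token is used. similar(token) gives you other words that appear in similar contexts.

To illustrate, here is a more general description that approximates what they do.

1) concordance(token): This returns a predefined number of words to the left and right of your token (let's call this set of words "Z"). It does this for every instance of your token in the text.

2) similar(token): If a word has a high likelihood of appearing within the word set "Z", it is listed here.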
