NLTK sentiment VADER: ordering results

Question  Votes: 0  Answers: 3

I just ran VADER sentiment analysis on my dataset:

from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize
sid = SentimentIntensityAnalyzer()
for sentence in filtered_lines2:
    print(sentence)
    ss = sid.polarity_scores(sentence)
    for k in sorted(ss):
        print('{0}: {1}, '.format(k, ss[k]), )
        print()

Here is a sample of my results:

Are these guests on Samsung and Google event mostly Chinese Wow Theyre
boring

Google Samsung

('compound: 0.3612, ',)
()
('neg: 0.12, ',)
()
('neu: 0.681, ',)
()
('pos: 0.199, ',)
()

Adobe lose 135bn to piracy Report

('compound: -0.4019, ',)
()
('neg: 0.31, ',)
()
('neu: 0.69, ',)
()
('pos: 0.0, ',)
()

Samsung Galaxy Nexus announced

('compound: 0.0, ',)
()
('neg: 0.0, ',)
()
('neu: 1.0, ',)
()
('pos: 0.0, ',)
()

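I want to know how many times "compound" is equal to, greater than, or less than zero.

I know this is probably easy, but I'm new to Python and to coding in general. I've tried many different ways to build what I need, but I can't find a solution.

(If the "sample of results" is formatted incorrectly, please edit my question, as I don't know the right way to write it.)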

python python-3.x nltk
3 Answers
1
vote

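This is by far not the most Pythonic way, but I think it is the easiest to understand if you don't have much Python experience. Essentially, you create a dictionary whose values start at 0 and increment the matching value in each case.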

from nltk.sentiment.vader import SentimentIntensityAnalyzer

sid = SentimentIntensityAnalyzer()
# Tally of sentences by the sign of their compound score
res = {"greater": 0, "less": 0, "equal": 0}
for sentence in filtered_lines2:
    ss = sid.polarity_scores(sentence)
    if ss["compound"] == 0.0:
        res["equal"] += 1
    elif ss["compound"] > 0.0:
        res["greater"] += 1
    else:
        res["less"] += 1
print(res)

1
vote

You can use a simple counter for each class:

positive, negative, neutral = 0, 0, 0

Then, inside the sentence loop, test the compound value and increment the corresponding counter:

    ...
    if ss['compound'] > 0:
        positive += 1
    elif ss['compound'] == 0:
        neutral += 1
    elif ...

and so on.
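As a rough sketch of how the elided branches might be completed: the final `else` branch and the `compound_scores` list are assumptions, not part of the answer above. The three values are the compound scores from the sample output in the question.

```python
# Compound scores copied from the sample output in the question;
# in practice each would come from sid.polarity_scores(sentence)["compound"].
compound_scores = [0.3612, -0.4019, 0.0]

positive, negative, neutral = 0, 0, 0

for compound in compound_scores:
    if compound > 0:
        positive += 1
    elif compound == 0:
        neutral += 1
    else:
        # Assumed final branch: whatever is left is below zero
        negative += 1

print(positive, negative, neutral)  # 1 1 1
```

Testing the zero case explicitly keeps exactly-neutral sentences out of the negative count.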


0
votes

I would probably define a function that returns the type of inequality a document's score represents:

def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"

Then use it on the compound scores of all the sentences to increment the count for the corresponding inequality type.

from collections import defaultdict

def count_sentiments(sentences):
    # Create a dictionary with values defaulted to 0
    counts = defaultdict(int)

    # Create a polarity score for each sentence
    for score in map(sid.polarity_scores, sentences):
        # Increment the dictionary entry for that inequality type
        counts[inequality_type(score["compound"])] += 1

    return counts

You can then call it on your filtered lines.
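For illustration, a call might look like the sketch below. To keep it runnable without the VADER lexicon, `StubAnalyzer` is a hypothetical stand-in for `SentimentIntensityAnalyzer` that simply returns the compound scores shown in the question's sample output. With NLTK set up, you would use the real `sid` and your own `filtered_lines2` instead.

```python
from collections import defaultdict


class StubAnalyzer:
    """Hypothetical stand-in for SentimentIntensityAnalyzer (not part of NLTK)."""

    # Compound scores taken from the sample output in the question
    _scores = {
        "Adobe lose 135bn to piracy Report": -0.4019,
        "Samsung Galaxy Nexus announced": 0.0,
        "Are these guests on Samsung and Google event mostly Chinese Wow Theyre boring": 0.3612,
    }

    def polarity_scores(self, sentence):
        # Sentences the stub does not know are treated as neutral
        return {"compound": self._scores.get(sentence, 0.0)}


sid = StubAnalyzer()


def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"


def count_sentiments(sentences):
    # Dictionary whose missing keys default to 0
    counts = defaultdict(int)
    for score in map(sid.polarity_scores, sentences):
        counts[inequality_type(score["compound"])] += 1
    return counts


filtered_lines2 = [
    "Adobe lose 135bn to piracy Report",
    "Samsung Galaxy Nexus announced",
    "Are these guests on Samsung and Google event mostly Chinese Wow Theyre boring",
]
counts = count_sentiments(filtered_lines2)
print(dict(counts))  # {'less': 1, 'equal': 1, 'greater': 1}
```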

However, you can avoid all of this simply by using collections.Counter:

from collections import Counter

def count_sentiments(sentences):
    # Count the inequality type for each score in the sentences' polarity scores
    return Counter(
        inequality_type(score["compound"])
        for score in map(sid.polarity_scores, sentences)
    )
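One practical point about the result, sketched below: a `Counter` returns 0 for a category that never occurred instead of raising `KeyError`, so all three categories can be read safely. The `compound_scores` list is an assumption standing in for real analyzer output; it holds two of the compound scores from the question's sample.

```python
from collections import Counter


def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"


# Stand-in for [sid.polarity_scores(s)["compound"] for s in sentences]
compound_scores = [0.3612, 0.0]

counts = Counter(inequality_type(c) for c in compound_scores)

# "less" never occurred, yet looking it up is safe and yields 0
for category in ("greater", "equal", "less"):
    print(category, counts[category])
```

Looking up a missing key this way does not add it to the `Counter`, so the stored counts stay limited to the categories that actually appeared.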