CoreNLP sentiment analysis in Python: looping over a dataframe

Question (votes: 0, answers: 2)

How can I make this code loop over all the sentences in a dataframe?

def get_sentiment(review):
    for text in review:
        senti = nlp.annotate(text,
                       properties={
                           'annotators': 'sentiment',
                           'outputFormat': 'json',
                           'timeout': 40000,
                       })

    #for i in senti["sentences"]:
        return ("{}: '{}': {} (Sentiment Value) {} (Sentiment)".format(
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))

When run, the above returns only the sentence from the first row, shown below:

"0: 'you can see everything from thousands of years in human history it was an unforgettable and wonderful trip to paris for me': 3 (Sentiment Value) Positive (Sentiment)"

I have tried several variations of the get_sentiment function, but the result shown is the best I have managed to get.

My dataframe is called reviews and has only one column (Review). Here is its content:

                                                                                                 Review
0   you can see everything from thousands of years in human history it was an unforgettable and wonderful trip to paris for me
1   buy your tickets in advance and consider investing in one of many reputable tour guides that you can find online for at least part of your visit to the louvre these 2 arrangements will absolutely maximize your time and enjoyment of th...
2   quite an interesting place and a must see for art lovers the museum is larger than i expected and has so many exhibition areas that a full day trip might be needed if one wants to visit the whole place
3   simply incredible do not forget to get a three day pass if you love architecture art and history it is a must
4   we got here about 45 minutes before opening time and we were very first in line to get into the museum make sure to buy tickets ahead of time to help get in faster this museum is massive and can easily take your entire day an incredi...
python-3.x function jupyter-notebook stanford-nlp pycorenlp
2 Answers
0
votes

Define your get_sentiment function as follows, so that it takes a single review string:

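def get_sentiment(review):
    # Annotate one review string; `nlp` is the pycorenlp client from the question.
    senti = nlp.annotate(review, properties={'annotators': 'sentiment', 'outputFormat': 'json', 'timeout': 40000})
    # CoreNLP may split a review into several sentences, so print one line per sentence.
    for s in senti["sentences"]:
        print("{}: '{}': {} (Sentiment Value) {} (Sentiment)".format(
            s["index"],
            " ".join([t["word"] for t in s["tokens"]]),
            s["sentimentValue"], s["sentiment"]))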

Then call it on every row of the Review column with pandas' apply() (here Series.apply, since reviews.Review is a single column):

>>> reviews.Review.apply(get_sentiment)
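
Because the function above prints instead of returning, apply() itself yields only None for each row. If you want to keep the results in the dataframe, a variant that returns values might look like the sketch below. The names review_sentiments and fake_annotate are hypothetical: fake_annotate is a stand-in that mimics only the JSON fields used in the question, so the example runs without a CoreNLP server. With pycorenlp you would pass nlp.annotate instead.

```python
import pandas as pd


def review_sentiments(review, annotate):
    """Return the sentiment label of each sentence found in one review."""
    senti = annotate(review, properties={
        'annotators': 'sentiment',
        'outputFormat': 'json',
        'timeout': 40000,
    })
    return [s["sentiment"] for s in senti["sentences"]]


def fake_annotate(text, properties=None):
    # Hypothetical stand-in for nlp.annotate: treats the whole text as one
    # sentence and picks a label with a trivial keyword rule.
    label = "Positive" if "incredible" in text else "Neutral"
    return {"sentences": [{"index": 0, "sentiment": label}]}


reviews = pd.DataFrame({"Review": ["simply incredible",
                                   "quite an interesting place"]})

# Store one list of labels per row in a new column.
reviews["Sentiment"] = reviews["Review"].apply(
    lambda r: review_sentiments(r, fake_annotate))
```

Each cell of the new Sentiment column holds a list, because a single review can contain more than one sentence.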

0
votes

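The return statement is inside the for loop. Because return exits the function as soon as it executes, the function stops after the first iteration and never reaches the remaining rows.

What you need to do:

Create a variable (a list, for example) before the loop starts, append each result to it inside the loop, and move the return of that variable out of the loop so it runs only once, at the end.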
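
A minimal sketch of that fix, under a few assumptions: the annotator is passed in as a parameter (annotate, a hypothetical name) so the example runs without a CoreNLP server, and fake_annotate is a stand-in that returns only the JSON fields the question's code reads. With pycorenlp you would pass nlp.annotate.

```python
def get_sentiment(reviews, annotate):
    """Return one formatted line per sentence across all reviews."""
    props = {'annotators': 'sentiment', 'outputFormat': 'json', 'timeout': 40000}
    results = []  # accumulator created before the loop
    for text in reviews:
        senti = annotate(text, properties=props)
        for s in senti["sentences"]:
            results.append("{}: '{}': {} (Sentiment Value) {} (Sentiment)".format(
                s["index"],
                " ".join(t["word"] for t in s["tokens"]),
                s["sentimentValue"], s["sentiment"]))
    return results  # returned once, after the loop has finished


def fake_annotate(text, properties=None):
    # Hypothetical stand-in for nlp.annotate: one sentence per text,
    # with a fixed sentiment, just to exercise the loop.
    return {"sentences": [{
        "index": 0,
        "tokens": [{"word": w} for w in text.split()],
        "sentimentValue": "2",
        "sentiment": "Neutral",
    }]}


lines = get_sentiment(["simply incredible", "quite an interesting place"],
                      fake_annotate)
```

With the dataframe from the question, the call would be something like get_sentiment(reviews.Review, nlp.annotate), since iterating over a pandas column yields its values one by one.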
