Create a separate column for each output stored in a single field in Python


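I am implementing sentiment analysis with an LSTM, and I have finished both training the model and the prediction part. However, my predictions all end up in a single column, as shown below.

Here is my code: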

import json
import re
import string

import pandas as pd

# new_data, word2id, id2label, label2id, max_words, remove_stopwords, keras and
# model_with_attentions are defined earlier in the script (not shown here).

with open('output1.json', 'w') as f:
    json.dump(new_data, f)

selection1 = new_data['selection1']
# empty lists used to build the dataframe
names = []
dates = []
commentss = []
labels = []
hotelname = []
for item in selection1:
    name = item['name']
    hotelname.append(name)
    Date = item['reviews']
    for d in Date:
        names.append(name)
        # convert date from 'January 12, 2020' to '2020-01-12'
        date = pd.to_datetime(d['date']).strftime("%Y-%m-%d")
        dates.append(date)
    CommentID = item['reviews']
    for com in CommentID:
        comment = com['review']
        lcomment = comment.lower()  # convert to lowercase
        result = re.sub(r'\d+', '', lcomment)  # remove numbers
        # remove punctuation and surrounding whitespace
        results = result.translate(str.maketrans('', '', string.punctuation)).strip()
        comments = remove_stopwords(results)
        commentss.append(comment)

        # encode only the words that are present in the word2id dictionary
        encoded_samples = [[word2id[word] for word in comments if word in word2id]]

        # padding
        encoded_samples = keras.preprocessing.sequence.pad_sequences(encoded_samples, maxlen=max_words)

        # make predictions
        label_probs, attentions = model_with_attentions.predict(encoded_samples)
        label_probs = {id2label[_id]: prob for (label, _id), prob in zip(label2id.items(), label_probs[0])}
        labels.append(label_probs)

# creating dataframe
dataframe = {'name': names, 'date': dates, 'comment': commentss, 'classification': labels}
table = pd.DataFrame(dataframe, columns=['name', 'date', 'comment', 'classification'])
# to_json returns None when given a path, so the result is not assigned
# (assigning it to `json` would also shadow the json module)
table.to_json('hotel.json', orient='records')

This is the result I get:

[
  {
    "name": "Radisson Blu Azuri Resort & Spa",
    "date": "February 02, 2020",
    "comment": [
      "enjoy",
      "daily",
      "package",
      "start",
      "welcoming",
      "end",
      "recommend",
      "hotel"
    ],
    "label": {
      "joy": 0.0791392997,
      "surprise": 0.0002606699,
      "love": 0.4324670732,
      "sadness": 0.2866959572,
      "fear": 0.0002588668,
      "anger": 0.2011781186
    }
  },

The full output is available at this link: https://jsonblob.com/a9b4035c-5576-11ea-afe8-1d95b3a2e3fd

Is it possible to split the label field into separate fields, as shown below?

[
  {
    "name": "Radisson Blu Azuri Resort & Spa",
    "date": "February 02, 2020",
    "comment": [
      "enjoy",
      "daily",
      "package",
      "start",
      "welcoming",
      "end",
      "recommend",
      "hotel"
    ],
      "joy": 0.0791392997,
      "surprise": 0.0002606699,
      "love": 0.4324670732,
      "sadness": 0.2866959572,
      "fear": 0.0002588668,
      "anger": 0.2011781186

  },

Could someone help me work out how to modify my code to make this possible, and explain how it works?

python pandas loops dataframe
1 Answer

If you cannot do this before generating the results, you can easily manipulate the dictionary like this:

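def move_labels_to_dict_root(result, key="label"):
    # key is "label" in the output shown in the question; adjust it if your field is named differently
    meta_data = dict(result)  # copy, so the original record is not mutated
    labels = meta_data.pop(key)  # remove the nested dict of probabilities
    return {**meta_data, **labels}  # merge both dicts at the top level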

Then call move_labels_to_dict_root in a list comprehension, for example [move_labels_to_dict_root(result) for result in results]
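As a minimal sketch of that call, assume the records have been loaded back from hotel.json into a list of dicts and that the nested field is named "label", as in the posted output. The helper name `flatten_label` is hypothetical and stands in for the function above. The sample record is abbreviated from the question.

```python
import json


def flatten_label(record, key="label"):
    """Return a copy of record with the nested dict under `key` merged into the top level."""
    flat = dict(record)         # shallow copy, so the input record is not mutated
    flat.update(flat.pop(key))  # remove the nested dict and merge its items
    return flat


# Abbreviated sample record taken from the question's output
results = [
    {
        "name": "Radisson Blu Azuri Resort & Spa",
        "date": "February 02, 2020",
        "comment": ["enjoy", "daily", "package"],
        "label": {"joy": 0.0791392997, "love": 0.4324670732},
    }
]

flattened = [flatten_label(r) for r in results]
print(json.dumps(flattened, indent=2))
```

Each emotion then appears as its own top-level key next to name, date and comment, and the original list is left unchanged.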

That said, may I ask why you want to do this?
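If the data is still in a DataFrame, the split can probably also be done before the file is written. The following is only a sketch. It assumes the column holding the dicts is called `classification`, as in the question's code, and uses one abbreviated row from the question for illustration. The names `label_columns` and `flat_table` are made up here.

```python
import pandas as pd

# One abbreviated row, shaped like the DataFrame built in the question
table = pd.DataFrame({
    "name": ["Radisson Blu Azuri Resort & Spa"],
    "date": ["February 02, 2020"],
    "comment": [["enjoy", "daily", "package"]],
    "classification": [{"joy": 0.0791392997, "love": 0.4324670732}],
})

# Expand each dict into its own columns, one column per emotion label
label_columns = pd.DataFrame(table["classification"].tolist(), index=table.index)

# Replace the single dict column with the expanded columns
flat_table = pd.concat([table.drop(columns="classification"), label_columns], axis=1)

records = flat_table.to_dict(orient="records")
print(records)
```

Calling `flat_table.to_json('hotel.json', orient='records')` afterwards should then write each label as a separate field.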
