Normalizing emotion labels for a PyTorch emotion model


I am trying to build an ML emotion model in PyTorch.

I have the emotion labels from the CMU-MOSEI dataset in a dataframe, like this:

happy sad anger surprise disgust fear
1.33 0.0 0.0 0.0 0.0 0.0
2.0 0.0 0.0 0.33 0.0 0.0
0.0 0.0 1.33 0.33 2.0 0.0

Each emotion value lies in the range 0.0 -> 3.0.

The question is: how do I normalize this data to the range 0 -> 1? Here is what I have tried:

1. Normalize each column with:

from sklearn.preprocessing import minmax_scale

for emo in ['happy', 'sad', 'anger', 'surprise', 'disgust', 'fear']:
    mosei[emo] = minmax_scale(mosei[emo])

which gives me, for example:

1.33,0.0,0.0,0.0,0.0,0.0
->
0.44,0.0,0.0,0.0,0.0,0.0

2.0,0.0,0.0,0.33,0.0,0.0
->
0.67,0.0,0.0,0.11,0.0,0.0

0.0,0.0,1.33,0.33,2.0,0.0
->
0.0,0.0,0.44,0.11,0.67,0.0

But for the last example, sum() > 1.

2. Normalize each column and then apply softmax() in the dataloader (a sketch follows the examples below):

>>> F.softmax(torch.tensor([0.44,0.0,0.0,0.0,0.0,0.0]), dim=0)
tensor([0.2370, 0.1526, 0.1526, 0.1526, 0.1526, 0.1526])


>>> F.softmax(torch.tensor([0.0,0.0,0.44,0.11,0.67,0.0]), dim=0)
tensor([0.1312, 0.1312, 0.2037, 0.1464, 0.2564, 0.1312])
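
A minimal sketch of how option 2 might look wrapped in a Dataset (the EmotionDataset name and structure are illustrative assumptions; it expects the per-column min-max scaled frame from option 1):

import torch
import torch.nn.functional as F
from torch.utils.data import Dataset

class EmotionDataset(Dataset):
    # Sketch only: stores the per-column min-max scaled emotion labels
    # and applies softmax per row when an item is fetched.
    def __init__(self, mosei):
        emotions = ['happy', 'sad', 'anger', 'surprise', 'disgust', 'fear']
        self.labels = torch.tensor(mosei[emotions].values, dtype=torch.float32)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # softmax turns the label row into a distribution that sums to 1
        return F.softmax(self.labels[idx], dim=0)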

3. Normalize per row instead of per column:

>>> minmax_scale([1.33,0.0,0.0,0.0,0.0,0.0])
array([1., 0., 0., 0., 0., 0.])


>>> minmax_scale([0.0,0.0,1.33,0.33,2.0,0.0])
array([0.   , 0.   , 0.665, 0.165, 1.   , 0.   ])

Still, the last example has sum() > 1.

  • Or maybe apply softmax again?

>>> F.softmax(torch.tensor([0.   , 0.   , 0.665, 0.165, 1.   , 0.   ]), dim=0)
tensor([0.1131, 0.1131, 0.2199, 0.1334, 0.3074, 0.1131])
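
Applied to the whole frame, option 3 could look roughly like the sketch below (assuming mosei is the same DataFrame as in option 1; note that an all-zero row stays all zeros after min-max scaling and turns into a uniform 1/6 distribution after softmax):

import numpy as np
import torch
import torch.nn.functional as F
from sklearn.preprocessing import minmax_scale

emotions = ['happy', 'sad', 'anger', 'surprise', 'disgust', 'fear']
labels = mosei[emotions].to_numpy(dtype=np.float32)

# row-wise min-max: rescale each row by its own min and max
row_scaled = np.vstack([minmax_scale(row) for row in labels])

# optional second step: softmax per row so that every row sums to 1
probs = F.softmax(torch.from_numpy(row_scaled.astype(np.float32)), dim=1)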

Or is there a different/better normalization approach altogether?

I have tried the approaches described in the question and searched Google.

python machine-learning pytorch dataset
1 Answer

Softmax is commonly used for normalization in ML. However, given your df, you can also do the following:

import pandas as pd

data = {
    'happy': [1.33, 2.0, 0.0],
    'sad': [0.0, 0.0, 0.0],
    'anger': [0.0, 0.0, 1.33],
    'surprise': [0.0, 0.33, 0.33],
    'disgust': [0.0, 0.0, 2.0],
    'fear': [0.0, 0.0, 0.0]
}

df = pd.DataFrame(data)

new_df = df.copy()
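# divide each row by its own sum so that its emotion values sum to 1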
for i in df.index:
    normalized_row = df.iloc[i] / df.iloc[i].sum()
    new_df.iloc[i] = normalized_row

Each row is now normalized and sums to 1.
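
A vectorized equivalent of the loop above, with a guard for rows that sum to 0 (which the loop would turn into NaN), could look like this:

row_sums = df.sum(axis=1)
new_df = df.div(row_sums.where(row_sums != 0, 1.0), axis=0)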
