I am trying to build an ML emotion model in PyTorch.
I have the emotion labels from the CMU-MOSEI dataset in a dataframe, like this:
happy | sad | anger | surprise | disgust | fear
---|---|---|---|---|---
1.33 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
2.0 | 0.0 | 0.0 | 0.33 | 0.0 | 0.0
0.0 | 0.0 | 1.33 | 0.33 | 2.0 | 0.0
Each emotion intensity can range from 0.0 to 3.0. The problem is that I need each label in the range 0 to 1, so I scaled each column:
```python
from sklearn.preprocessing import minmax_scale

for emo in ['happy', 'sad', 'anger', 'surprise', 'disgust', 'fear']:
    mosei[emo] = minmax_scale(mosei[emo])
```
This gives me, for example:

```
1.33, 0.0, 0.0,  0.0,  0.0, 0.0  ->  0.44, 0.0, 0.0,  0.0,  0.0,  0.0
2.0,  0.0, 0.0,  0.33, 0.0, 0.0  ->  0.67, 0.0, 0.0,  0.11, 0.0,  0.0
0.0,  0.0, 1.33, 0.33, 2.0, 0.0  ->  0.0,  0.0, 0.44, 0.11, 0.67, 0.0
```
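Note that these values correspond to dividing by the fixed upper bound of 3.0: when a column's observed minimum and maximum happen to be 0.0 and 3.0, `minmax_scale` reduces to exactly that. A minimal sketch (the function name is illustrative, not from any library):

```python
def scale_by_max(row, max_val=3.0):
    """Scale each intensity by the fixed upper bound of the label range.

    CMU-MOSEI emotion intensities lie in [0.0, 3.0], so dividing by 3.0
    maps every value into [0, 1] independently of the data split.
    """
    return [v / max_val for v in row]

scaled = scale_by_max([1.33, 0.0, 0.0, 0.0, 0.0, 0.0])
# first value is 1.33 / 3.0, roughly 0.44
```

Scaling by the known bound rather than the per-column data maximum avoids the normalization changing between train and test splits.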
But for the last example `sum() > 1`, so I also tried `softmax()`:

```python
>>> F.softmax(torch.tensor([0.44, 0.0, 0.0, 0.0, 0.0, 0.0]), dim=0)
tensor([0.2370, 0.1526, 0.1526, 0.1526, 0.1526, 0.1526])
>>> F.softmax(torch.tensor([0.0, 0.0, 0.44, 0.11, 0.67, 0.0]), dim=0)
tensor([0.1312, 0.1312, 0.2037, 0.1464, 0.2564, 0.1312])
```
I also tried `minmax_scale` per row instead of per column:

```python
>>> minmax_scale([1.33, 0.0, 0.0, 0.0, 0.0, 0.0])
array([1., 0., 0., 0., 0., 0.])
>>> minmax_scale([0.0, 0.0, 1.33, 0.33, 2.0, 0.0])
array([0.   , 0.   , 0.665, 0.165, 1.   , 0.   ])
```

but again the last example has `sum() > 1`, and with softmax:

```python
>>> F.softmax(torch.tensor([0., 0., 0.665, 0.165, 1., 0.]), dim=0)
tensor([0.1131, 0.1131, 0.2199, 0.1334, 0.3074, 0.1131])
```
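The softmax outputs above illustrate a side effect worth noting: since `exp(0) = 1`, every zero-intensity emotion still receives positive probability mass, which is why absent emotions end up around 0.11 to 0.15 instead of 0. A minimal sketch of the computation in plain Python:

```python
import math

def softmax(xs):
    """Softmax over a list: exp each value, then divide by the total.

    Zero inputs map to exp(0) = 1, so they still get nonzero mass.
    """
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([0.44, 0.0, 0.0, 0.0, 0.0, 0.0])
# probs[0] is about 0.237; the five zero entries each get about 0.153
```

This is why softmax is a questionable fit here: the labels are intensities, not logits, and softmax erases the meaning of "this emotion is absent".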
Or is there perhaps a different/better normalization approach? I have tried the problematic methods described above and searched Google without luck.
Softmax is commonly used for normalization in ML. However, you can also do the following with your df:
```python
import pandas as pd

data = {
    'happy': [1.33, 2.0, 0.0],
    'sad': [0.0, 0.0, 0.0],
    'anger': [0.0, 0.0, 1.33],
    'surprise': [0.0, 0.33, 0.33],
    'disgust': [0.0, 0.0, 2.0],
    'fear': [0.0, 0.0, 0.0]
}
df = pd.DataFrame(data)

# Divide each row by its sum so the emotions form a distribution.
new_df = df.copy()
for i in df.index:
    new_df.iloc[i] = df.iloc[i] / df.iloc[i].sum()
```
Each row is now normalized and sums to 1.
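The loop above can also be written as a single vectorized division; a sketch, with a guard against all-zero rows (an utterance with no annotated emotion would otherwise divide by zero):

```python
import pandas as pd

df = pd.DataFrame({
    'happy': [1.33, 2.0, 0.0],
    'sad': [0.0, 0.0, 0.0],
    'anger': [0.0, 0.0, 1.33],
    'surprise': [0.0, 0.33, 0.33],
    'disgust': [0.0, 0.0, 2.0],
    'fear': [0.0, 0.0, 0.0],
})

row_sums = df.sum(axis=1)
# Replace zero sums with 1.0 so all-zero rows stay all-zero instead of NaN.
normalized = df.div(row_sums.where(row_sums != 0, 1.0), axis=0)
```

This avoids the Python-level loop, which matters for CMU-MOSEI-sized dataframes with thousands of rows.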