Softmax implementation returns nan for large inputs

Question

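I am trying to implement softmax at the end of a CNN, and the output I get is nan and zeros. I am feeding softmax large input values, around 10-20k. For example, I passed in X = [2345, 3456, 6543, -6789, -9234].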

My function is:

def softmax (X):
    B=np.exp(X)
    C=np.sum(np.exp(X))
    return B/C

I get a true_divide RuntimeWarning:

C:\Anaconda\envs\deep_learning\lib\site-packages\ipykernel_launcher.py:4: RuntimeWarning: invalid value encountered in true_divide
  after removing the cwd from sys.path.

Answer 1

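According to the softmax function, you need to iterate over all elements of the array, compute the exponential of each individual element, and then divide it by the sum of the exponentials of all elements: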

import numpy as np

a = [1,3,5]
for i in a:
    # exponential of one element divided by the sum of all exponentials
    print(np.exp(i)/np.sum(np.exp(a)))

0.015876239976466765
0.11731042782619837
0.8668133321973349

However, if the numbers are too large, the exponentials will blow up (the computer cannot represent such large numbers):

a = [2345,3456,6543]
for i in a:
    print(np.exp(i)/np.sum(np.exp(a)))

__main__:2: RuntimeWarning: invalid value encountered in double_scalars
nan
nan
nan

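To avoid this, first shift the values so that the highest value in the array becomes zero, then compute the softmax. For example, to compute the softmax of [1, 3, 5], use [1-5, 3-5, 5-5], which is [-4, -2, 0]. You can also implement it in a vectorized way (as you intended to do in the question):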

def softmax(x):
    f = np.exp(x - np.max(x))  # shift values
    return f / f.sum(axis=0)

softmax([1,3,5])
# prints: array([0.01587624, 0.11731043, 0.86681333])

softmax([2345,3456,6543,-6789,-9234])
# prints: array([0., 0., 1., 0., 0.])

For details, see the cs231n course page. The heading "Practical issues: Numeric stability" covers exactly what I am trying to explain.
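The reason the shift is safe can be written out directly. A short derivation, using x_i for the inputs and m = max_j x_j (notation introduced here for illustration, not taken from the answer):

```latex
\mathrm{softmax}(x)_i
  = \frac{e^{x_i}}{\sum_j e^{x_j}}
  = \frac{e^{-m}\, e^{x_i}}{e^{-m} \sum_j e^{x_j}}
  = \frac{e^{x_i - m}}{\sum_j e^{x_j - m}}
```

Every exponent x_i - m is at most 0, so each term is at most 1 and the largest term is exactly 1. The denominator is therefore at least 1, which rules out both overflow and a zero denominator.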

Answer 2

If you are applying softmax to large numbers, you can try max normalization:

import numpy as np

def softmax (x):
    B=np.exp(x)
    C=np.sum(np.exp(x))
    return B/C

arr = np.array([1,2,3,4,5])

softmax(arr)
# array([0.01165623, 0.03168492, 0.08612854, 0.23412166, 0.63640865])

softmax(arr - max(arr))
# array([0.01165623, 0.03168492, 0.08612854, 0.23412166, 0.63640865])

As you can see, this does not affect the result of softmax. Applying this to your softmax:

def softmax(x):
    B = np.exp(x - max(x))
    C = np.sum(B)
    return B/C
op_arr = np.array([2345,3456,6543,-6789,-9234])
softmax(op_arr)
# array([0., 0., 1., 0., 0.])
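The same shift extends to a batch of logits, which is the usual shape at the end of a CNN. A minimal sketch, assuming a (batch, classes) layout; the name softmax_rows and the axis argument are illustrative assumptions, not part of the answer above:

```python
import numpy as np

def softmax_rows(x, axis=-1):
    """Numerically stable softmax along `axis` (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    # Subtract each row's own maximum so the largest exponent is exp(0) = 1.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    # keepdims=True keeps the sum broadcastable against e.
    return e / np.sum(e, axis=axis, keepdims=True)

batch = np.array([[1.0, 3.0, 5.0],
                  [2345.0, 3456.0, 6543.0]])
probs = softmax_rows(batch)
print(probs)
print(probs.sum(axis=1))  # each row should sum to 1
```

Taking the maximum per row rather than over the whole array matters here: with a single global maximum, a row of small logits sitting next to a row of very large ones could underflow to all zeros and produce 0/0.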

Answer 3

When I run the same code, I get:

RuntimeWarning: overflow encountered in exp
RuntimeWarning: overflow encountered in exp
RuntimeWarning: invalid value encountered in true_divide

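This is not surprising, since e^(6543) is around 0.39 * 10^2842, far beyond the largest float64 value (about 1.8 * 10^308), so the exponential overflows and the operations that follow produce nan.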
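The overflow can be reproduced in isolation. A small sketch, assuming NumPy's default float64; the variable names are illustrative:

```python
import numpy as np

# Largest x for which exp(x) still fits in a float64 (roughly 709.78).
limit = np.log(np.finfo(np.float64).max)

# Silence the overflow/invalid warnings so only the values are shown.
with np.errstate(over="ignore", invalid="ignore"):
    finite = np.exp(709.0)            # still representable
    overflowed = np.exp(6543.0)       # overflows to inf
    ratio = overflowed / overflowed   # inf / inf is nan

print(limit, finite, overflowed, ratio)
```

This is where the nan in the question comes from: both the numerator and the denominator become inf, and inf / inf is undefined.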

What to do: normalize your data before passing it to softmax. You could divide it by 1000 first, so that instead of inputs in [-20000, 20000] you have floats in [-20, 20].
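Note that, unlike subtracting the maximum, dividing the inputs by a constant changes the resulting probabilities (the divisor acts like a temperature). A small sketch comparing the two on the question's values; the helper name stable_softmax is an assumption made for illustration:

```python
import numpy as np

def stable_softmax(x):
    """Softmax with the max-shift trick (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([2345, 3456, 6543, -6789, -9234], dtype=np.float64)

shifted = stable_softmax(x)           # same distribution as the exact softmax of x
scaled = stable_softmax(x / 1000.0)   # softmax of a different, flatter input

print(shifted)
print(scaled)
```

Both outputs are valid probability vectors, but the scaled one spreads some mass onto the smaller entries, so scaling is a modeling choice rather than a pure numerical fix.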
