I'm trying to build a neural network to approximate the square of numbers from -50 to 50. I based my code on the code in this answer:
import tensorflow as tf
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import regularizers
x_train = np.random.random((10000,1))*100-50
y_train = np.square(x_train)
model = Sequential(
    [
        Dense(8, activation='relu', kernel_regularizer=regularizers.l2(0.001), input_shape=(1,)),
        Dense(8, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
        Dense(1, activation='relu')
    ]
)
batch_size = 32
epochs = 100
model.compile(loss = 'mse', optimizer='adam')
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose = 1)
x = "n"
while True:
    print("enter num:")
    x = input()
    if x == "end":
        break
    X = int(x)
    predicted_sum = model.predict(np.array([X]))
    print(predicted_sum)
The problem is that every input produces the output "[[0.]]". I don't know what is causing this or how to fix it — can anyone help?
This message appears as soon as the code starts running — is it related to the problem?
oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
I got reasonable results by changing the batch size from 32 to 256, changing the number of epochs from 100 to 1000, and removing the ReLU activation from the last layer. (The oneDNN message is just an informational notice and is unrelated to the problem.)
ReLU has the problem that if the input to the activation is x < 0, then ReLU'(x) = 0. This means that if the output of the final layer happens to be negative, there is no gradient to correct it.
This is also an issue in the earlier layers, but it is far more likely for a single neuron to get unlucky this way than for all eight of them to.
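You can see this dead-gradient effect directly with `tf.GradientTape` (the values here are just illustrative):

```python
import tensorflow as tf

# ReLU passes gradient only where its input is positive.
x = tf.constant([-3.0, 2.0])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.nn.relu(x)

grads = tape.gradient(y, x)
print(grads.numpy())  # [0. 1.] -- no gradient flows through the negative input
```

Once the single output neuron lands in that zero-gradient region for every training example, nothing can push it back out.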
As an alternative to removing it, you could also look into leaky ReLU.
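Leaky ReLU returns `alpha * x` instead of 0 for negative inputs, so the gradient is never exactly zero and "dead" units can still recover. A quick demonstration with `tf.nn.leaky_relu` (the alpha value here is just an example):

```python
import tensorflow as tf

# leaky_relu(x) = x for x >= 0, alpha * x otherwise
x = tf.constant([-10.0, 10.0])
print(tf.nn.leaky_relu(x, alpha=0.1).numpy())  # [-1. 10.]
```

You could pass `tf.nn.leaky_relu` as the `activation` argument of a `Dense` layer instead of `'relu'`.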
Considering that the original code trained for 15,000 epochs, it's no surprise that 100 epochs gives worse results. I also changed the batch size to 256 to make the code run faster.
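Putting those changes together, the fix looks roughly like this (a sketch keeping your architecture, with a linear output layer; I've used fewer epochs than the 1000 mentioned above just to keep the example quick):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import regularizers

x_train = np.random.random((10000, 1)) * 100 - 50
y_train = np.square(x_train)

model = Sequential(
    [
        Dense(8, activation='relu', kernel_regularizer=regularizers.l2(0.001), input_shape=(1,)),
        Dense(8, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
        Dense(1)  # no activation: the output can now go negative during training
    ]
)
model.compile(loss='mse', optimizer='adam')
model.fit(x_train, y_train, batch_size=256, epochs=100, verbose=0)

print(model.predict(np.array([[7.0]])))  # should be in the vicinity of 49
```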