How to train a neural network when X is 5-D and y is 2-D (padded)


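I am trying to train a neural network that imitates a specific player's chess style. The problem is that I cannot train the model because of the shapes of my data. Maybe I need to make y 1-D, because the shape of y_train should match the shape of the output layer. But then I think the model could no longer keep track of each game and each move. The 245 and 14 dimensions also do not seem well suited to training the model. How can I implement the model so that it can learn, or how do I need to modify the data so that I can train the model and get a move as output? I expect the model to take a board state and return one move/label.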

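X shape (24242, 245, 14, 8, 8): a NumPy array containing 24242 games, represented numerically. Each game is padded to 245 moves, and each move consists of 14 boards of 8x8: one board for each piece type (pawn, rook, knight, bishop, queen, king) plus one board marking every square that is attacked from White's point of view, and the same again for the black pieces.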

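y shape (24242, 245): one list per game (24242 in total), each containing 245 elements (padded with "0"). The entries are label-encoded SAN moves, but only the moves played in that game by the player I want to imitate. The first 3 games look like this: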

(24242, 245)
[[3830 3956    0    0    0    0    0    0    0    0    0    0    0    0...
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0]
 [4026 3961    0    0    0    0    0    0    0    0    0    0    0    0...
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0]
 [3769 3711  715 3830   76 3958 3772 4024 4026 4051 3986 3873   58  225
   976 1065 1643 1702 3893   79 3917 1318  890    0    0    0    0    0 ...
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0]]
<class 'numpy.ndarray'>
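
One possible way to get a model that maps a single board state to a single move is to treat every non-padded (game, move) pair as its own training sample. The sketch below illustrates the idea on small random arrays, not on the real data. It assumes that label 0 is used only for padding and that y[g, t] is the move played from position X[g, t]; because y holds only the imitated player's moves, that alignment would have to be checked against how X was built. The names `flatten_games`, `X_flat`, and `y_flat` are hypothetical.

```python
import numpy as np

def flatten_games(X, y, pad_label=0):
    """Turn padded games into one sample per real move.

    X: (games, max_len, 14, 8, 8), y: (games, max_len).
    Assumes y[g, t] belongs to position X[g, t] and pad_label marks padding.
    """
    mask = y != pad_label   # (games, max_len), True where a real move exists
    X_flat = X[mask]        # (n_moves, 14, 8, 8)
    y_flat = y[mask]        # (n_moves,)
    return X_flat, y_flat

# Tiny stand-in for the real arrays: 3 games padded to 6 moves.
rng = np.random.default_rng(0)
X = rng.random((3, 6, 14, 8, 8), dtype=np.float32)
y = np.array([[3830, 3956, 0, 0, 0, 0],
              [4026, 3961, 0, 0, 0, 0],
              [3769, 3711, 715, 3830, 76, 0]])

X_flat, y_flat = flatten_games(X, y)
print(X_flat.shape, y_flat.shape)  # (9, 14, 8, 8) (9,)
```

After this step the first dimension of the inputs and of the labels is the same, which is what a per-position classifier needs.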

Model:


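from keras import callbacks, layers, models, optimizers

# max_len and d are defined elsewhere (not shown).
def build_model(conv_size, conv_depth):
    # Input is one whole padded game: (moves, planes, rows, cols)
    board3d = layers.Input(shape=(max_len, 14, 8, 8))
    x = board3d
    for _ in range(conv_depth):
        # Conv2D treats the last axis as channels by default (channels_last)
        x = layers.Conv2D(filters=conv_size, kernel_size=3, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, "relu")(x)
    x = layers.Dense(len(d), "sigmoid")(x)  # len(d) = 4066 distinct labels
    return models.Model(inputs=board3d, outputs=x)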


model = build_model(32, 4)

model.compile(optimizer=optimizers.Adam(5e-4), loss="categorical_crossentropy")
model.summary()



model.fit(X_train, y_train,
          batch_size=5,
          epochs=1000,
          verbose=1,
          validation_split=0.1,
          callbacks=[callbacks.ReduceLROnPlateau(monitor="loss", patience=10),
                     callbacks.EarlyStopping(monitor="loss", patience=15, min_delta=0.0001)])
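
For comparison, here is a minimal sketch of a model that takes one position and returns a distribution over move labels, which is the behavior the question asks for. It is one possible arrangement, not a tested solution for this data set. It assumes the samples have already been split into single positions of shape (14, 8, 8) with one integer label each; `build_position_model`, `X_flat`, and `y_flat` are hypothetical names, and the data below is random.

```python
import numpy as np
from tensorflow.keras import layers, models, optimizers

def build_position_model(num_labels, conv_size=32, conv_depth=4):
    # One position per sample: 14 planes of 8x8, stored planes-first.
    board = layers.Input(shape=(14, 8, 8))
    # Move the 14 planes to the last axis, which Conv2D treats as channels by default.
    x = layers.Permute((2, 3, 1))(board)
    for _ in range(conv_depth):
        x = layers.Conv2D(conv_size, 3, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    # Softmax yields a single probability distribution over all move labels.
    out = layers.Dense(num_labels, activation="softmax")(x)
    model = models.Model(inputs=board, outputs=out)
    # The sparse loss takes integer labels of shape (batch,), so no one-hot encoding is needed.
    model.compile(optimizer=optimizers.Adam(5e-4),
                  loss="sparse_categorical_crossentropy")
    return model

model = build_position_model(num_labels=4066)

# Random stand-in data: 10 positions with integer move labels.
X_flat = np.random.rand(10, 14, 8, 8).astype("float32")
y_flat = np.random.randint(1, 4066, size=10)
model.fit(X_flat, y_flat, batch_size=5, epochs=1, verbose=0)
print(model.output_shape)
```

With this layout the labels have one entry per input row, so the first dimensions of the model output and of the labels agree.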

    

Output

Model: "model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_9 (InputLayer)        [(None, 248, 14, 8, 8)]   0         
                                                                 
 conv2d_8 (Conv2D)           (None, 248, 14, 8, 32)    2336      
                                                                 
 conv2d_9 (Conv2D)           (None, 248, 14, 8, 32)    9248      
                                                                 
 conv2d_10 (Conv2D)          (None, 248, 14, 8, 32)    9248      
                                                                 
 conv2d_11 (Conv2D)          (None, 248, 14, 8, 32)    9248      
                                                                 
 flatten_2 (Flatten)         (None, 888832)            0         
                                                                 
 dense_6 (Dense)             (None, 64)                56885312  
                                                                 
 dense_7 (Dense)             (None, 4063)              264095    
                                                                 
=================================================================
Total params: 57,179,487
Trainable params: 57,179,487
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1000

InvalidArgumentError                      Traceback (most recent call last)
Cell In[151], line 23
     20 # Reshape y_train to match the output shape of the model
     21 y_train = y_train.reshape(y_train.shape[0], -1)
---> 23 model.fit(X_train, y_train,
     24           batch_size=5,
     25           epochs=1000,
     26           verbose=1,
     27           validation_split=0.1,
     28           callbacks=[callbacks.ReduceLROnPlateau(monitor="loss", patience=10),
     29                      callbacks.EarlyStopping(monitor="loss", patience=15, min_delta=0.0001)])

File ~\AppData\Roaming\Python\Python310\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback..error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~\AppData\Roaming\Python\Python310\site-packages\tensorflow\python\eager\execute.py:52, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     50 try:
     51   ctx.ensure_initialized()
---> 52   tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
...
    File "C:\Users\ffgd\AppData\Roaming\Python\Python310\site-packages\keras\backend.py", line 5633, in sparse_categorical_crossentropy
      res = tf.nn.sparse_softmax_cross_entropy_with_logits(
Node: 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
logits and labels must have the same first dimension, got logits shape [5,4066] and labels shape [1225]
     [[{{node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_4291]
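
The numbers in this message appear consistent with the shapes described earlier: the model emits one row of 4066 scores per game in the batch (batch_size=5), while each game carries 245 labels, and 5 * 245 = 1225. The short NumPy sketch below reproduces only that arithmetic; the array names are hypothetical and no Keras code is involved.

```python
import numpy as np

batch_size, max_len, num_classes = 5, 245, 4066

# What the model produces for one batch: one score vector per game.
logits = np.zeros((batch_size, num_classes))

# What y_train supplies for the same batch: 245 labels per game.
labels = np.zeros((batch_size, max_len), dtype=np.int64)

# Flattening the labels gives one long vector of per-move labels.
flat_labels = labels.reshape(-1)

print(logits.shape[0], flat_labels.shape[0])  # 5 1225
```

The loss needs exactly one label per row of scores, so either the labels must be reduced to one per sample or the model must output one prediction per move.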

Sometimes, when I try changing the model, I also get an error message similar to this one:

#ValueError: Shapes (None, 245) and (None, 4066) are incompatible

keras deep-learning artificial-intelligence training-data chess