Why does my dataset shrink during training?

Question · 0 votes · 1 answer

During training, my dataset seems to be shrinking, and I don't know what is causing it. I've padded X and used a train-test split:

from keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split

max_features = 4500
X = pad_sequences(sequences = X, maxlen = max_features, padding = 'pre')
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 101)

X_train.shape

(17983, 4500)

y_train.shape

(17983,)

Here is my LSTM model:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, Dense

lstm_model = Sequential(name = 'lstm_nn_model')
lstm_model.add(layer = Embedding(input_dim = max_features, output_dim = 120, name = '1st_layer'))
lstm_model.add(layer = LSTM(units = 120, dropout = 0.2, recurrent_dropout = 0, name = '2nd_layer'))
lstm_model.add(layer = Dropout(rate = 0.5, name = '3rd_layer'))
lstm_model.add(layer = Dense(units = 120,  activation = 'relu', name = '4th_layer'))
lstm_model.add(layer = Dropout(rate = 0.5, name = '5th_layer'))
lstm_model.add(layer = Dense(units = len(set(y)),  activation = 'sigmoid', name = 'output_layer'))
lstm_model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])


lstm_model_fit = lstm_model.fit(X_train, y_train, epochs = 2)

When the epoch started running, the progress counter showed 1/17983. Now, when I re-run it, it shows 1/562. Note that I am new to this and am just running existing code to learn. Why does this happen?

python tensorflow keras lstm training-data
1 Answer
1 vote

When you fit a model, the GPU has to load the data and process it. If it loaded all 17983 samples at once, it could run out of memory. So the data is split into "batches", groups of samples that are processed together. The default batch size is 32, and 17983 / 32 = 561.96875, which rounds up to 562. The progress bar in newer Keras versions counts batches (steps) per epoch rather than individual samples, so your dataset has not shrunk at all.
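You can check the arithmetic directly. This is a minimal sketch; the commented-out `fit` call at the end assumes the `lstm_model` object from the question:

```python
import math

# Keras' model.fit() uses batch_size=32 by default, so the progress bar
# shows the number of batches (steps) per epoch, not the sample count.
n_samples = 17983
batch_size = 32

# The last, partially filled batch still counts as a step, so round up.
steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)  # 562

# To set the batch size explicitly on the question's model:
# lstm_model.fit(X_train, y_train, epochs=2, batch_size=32)
```

With `batch_size=1` you would get back the 17983 steps per epoch you saw originally, but training would be far slower.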
