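I am building a neural network for speaker recognition and I have run into a problem with dimensions. I must be doing something wrong in the batch generator, but I cannot work out what. My steps are as follows. First, I prepare the labels: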
    labels = []
    with open('filtered_files.csv', 'r') as csvfile:
        reader = csv.reader(csvfile)
        for file in reader:
            label = file[0]
            if label not in labels:
                labels.append(label)
    print(labels)
Then I declare batch_generator:
    n_features = 20
    max_length = 1000
    n_classes = len(labels)

    def batch_generator(data, batch_size=16):
        while 1:
            random.shuffle(data)
            X, y = [], []
            for i in range(batch_size):
                print(i)
                wav = data[i]
                waves, sr = librosa.load(wav, mono=True)
                print(waves)
                filename = wav.split('\\')[1]
                filename = filename.split('.')[0] + ".mp3"
                filename = filename.split('_', 1)[1]
                print(filename)
                with open('filtered_files.csv', 'r') as csvfile:
                    reader = csv.reader(csvfile)
                    for file in reader:
                        if filename == file[1]:
                            print(file[0])
                            label = file[0]
                            break
                        else:
                            continue
                y.append(one_hot_encode(["'" + label + "'"]))
                mfcc = librosa.feature.mfcc(waves, sr)
                mfcc = np.pad(mfcc, ((0, 0), (0, max_length - len(mfcc[0]))), mode='constant', constant_values=0)
                X.append(np.array(mfcc))
            yield np.array(X), np.array(y)
Finally, I declare the neural network and start the training process:
    learning_rate = 0.001
    batch_size = 64
    n_epochs = 50
    dropout = 0.5
    input_shape = (n_features, max_length)
    steps_per_epoch = 50

    model = Sequential()
    model.add(LSTM(256, return_sequences=True, input_shape=input_shape, dropout=dropout))
    # model.add(Flatten())
    # model.add(Dense(128, activation='relu'))
    # model.add(Dropout(dropout))
    # model.add(Dense(n_classes, activation='softmax'))

    opt = Adam(lr=learning_rate)
    model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
    model.summary()

    history = model.fit_generator(
        generator=batch_generator(X_train, batch_size),
        steps_per_epoch=steps_per_epoch,
        epochs=n_epochs,
        verbose=1,
        validation_data=batch_generator(X_val, 32),
        validation_steps=5,
        callbacks=callbacks
    )
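I have included a lot of code because I am not sure which part is actually causing the wrong dimensions. With only the first layer, I get the following error: "Error when checking target: expected lstm_1 to have shape (20, 256) but got array with shape (1, 76)".
If I uncomment the second layer, I get: "Error when checking target: expected flatten_1 to have 2 dimensions, but got array with shape (64, 1, 76)".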
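There is a shape mismatch between the model and the data the generator yields. As the error shows, the array coming from the dataset has shape (1, 76) per sample, while the model expects shape (20, 256). Note that the message says "checking target", so (20, 256) is the output shape of the last layer (an LSTM with 256 units and return_sequences=True, run over 20 time steps), not input_shape = (n_features, max_length), which is (20, 1000). The array with shape (1, 76) is the one-hot label, not the MFCC input.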
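
The mismatch can be traced by hand. The sketch below is not Keras code: the helpers lstm_output_shape, flatten_output_shape and dense_output_shape are hypothetical names that only mirror the per-sample shape rules of the corresponding Keras layers. It assumes n_classes is 76 (read off the error message) and that one_hot_encode returns one row per label it is given, so a one-element list produces a (1, 76) array.

```python
# Hypothetical helpers mirroring Keras shape rules for ONE sample
# (the batch axis is left out). They are illustrations, not library calls.

def lstm_output_shape(input_shape, units, return_sequences):
    """LSTM input is (timesteps, features) per sample."""
    timesteps, _features = input_shape
    return (timesteps, units) if return_sequences else (units,)

def flatten_output_shape(input_shape):
    """Flatten collapses all per-sample axes into one."""
    size = 1
    for dim in input_shape:
        size *= dim
    return (size,)

def dense_output_shape(input_shape, units):
    """Dense only changes the last axis."""
    return input_shape[:-1] + (units,)

n_features, max_length = 20, 1000
n_classes = 76  # assumption: taken from the (1, 76) in the error message
input_shape = (n_features, max_length)

# Model as posted: the LSTM is the last layer, so its output is compared
# against the targets.
posted_output = lstm_output_shape(input_shape, 256, return_sequences=True)

# Model with the commented-out layers restored (Dropout keeps the shape).
x = flatten_output_shape(posted_output)
x = dense_output_shape(x, 128)
full_output = dense_output_shape(x, n_classes)

# Per-sample target shapes.
target_as_yielded = (1, n_classes)  # assumed result of one_hot_encode([label])
target_needed = (n_classes,)        # e.g. one_hot_encode([label])[0]

print(posted_output)                     # (20, 256)
print(full_output)                       # (76,)
print(target_as_yielded == full_output)  # False
print(target_needed == full_output)      # True
```

Under these assumptions, two changes are needed together: restore the layers down to Dense(n_classes, activation='softmax') so the model ends in a vector of length n_classes, and have the generator append a flat label vector (for example the first row of what one_hot_encode returns) so that np.array(y) has shape (batch_size, n_classes) rather than (batch_size, 1, n_classes).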