Using a model from a Python script

Question

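First of all, I am a complete beginner in machine learning, so I may not be able to follow complex suggestions and answers. This is for my university, and I do not have time to learn the fundamentals.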

I am trying to use a model that I created with the GTZAN dataset for music genre classification (https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification).

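The model has high accuracy, but the output I get is not satisfactory. I am not sure how much information is needed, but I suspect I made a mistake in the script that uses the model. Here is the script.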

import tensorflow as tf
import librosa
import numpy as np

# Load the pre-trained model
model = tf.keras.models.load_model('C:/Users/VOLKAN/Desktop/SonProject/model.keras')

# Define the genres (assuming you have a fixed list of genres the model predicts)
genres = ['Blues', 'Classical', 'Country', 'Disco', 'Hip-hop', 'Jazz', 'Metal', 'Pop', 'Reggae', 'Rock']  # Replace with actual genre names

def extract_features(file_path):
    y, sr = librosa.load(file_path, duration=30)
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=58)
    mfccs = np.mean(mfccs.T, axis=0)
    features = mfccs[np.newaxis, ...]
    return features



def predict_genre(file_path):
    features = extract_features(file_path)
    predictions = model.predict(features)
    genre_index = np.argmax(predictions, axis=1)[0]
    return genres[genre_index]

# Example usage
audio_file = 'C:/Users/VOLKAN/Desktop/Data/genres_original/classical/classical.00059.wav'  # Replace with your audio file path
predicted_genre = predict_genre(audio_file)
print(f'The predicted genre is: {predicted_genre}')


I used a CNN in the model. The model file has the .keras extension.

Answer 1

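Make sure the path to the saved model file (model.keras) is correct and that the file exists at that location. Check that the number of Mel-frequency cepstral coefficients (MFCCs) you extract from the audio is 58, which should match the input size the model expects. Make sure the model's output layer has 10 units with softmax activation, one for each of the 10 genres you are classifying. Double-check that the file path passed to the predict_genre function is correct and that an audio file exists at that location. If your training data was normalized, apply the same normalization to the input audio before predicting. Finally, add print statements at several points in the code to inspect the shapes of the features and the predictions, which helps when debugging.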
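The input-size and normalization checks above can be sketched roughly as follows. This is only an illustration under stated assumptions: shapes_compatible and standardize are hypothetical helper names, train_mean and train_std stand for statistics that would have to be saved from the training data (the question does not show whether any normalization was used), and the toy arrays use 3 features instead of 58. In a real script you would pass model.input_shape and features.shape to the first helper.

```python
import numpy as np

def shapes_compatible(model_input_shape, feature_shape):
    """Return True if feature_shape fits model_input_shape.

    A None entry (Keras reports the batch dimension this way) matches any size.
    """
    if len(model_input_shape) != len(feature_shape):
        return False
    return all(m is None or m == f
               for m, f in zip(model_input_shape, feature_shape))

def standardize(features, train_mean, train_std, eps=1e-8):
    """Scale features with statistics computed on the TRAINING set.

    features: array of shape (1, n_features), like extract_features returns.
    train_mean, train_std: arrays of shape (n_features,); hypothetical values
    that must come from the training data, not from the new audio file.
    """
    features = np.asarray(features, dtype=np.float32)
    return (features - train_mean) / (train_std + eps)

# Toy stand-in values (3 features instead of 58), for illustration only.
train_mean = np.array([10.0, -5.0, 0.5], dtype=np.float32)
train_std = np.array([2.0, 1.0, 0.25], dtype=np.float32)
raw = np.array([[12.0, -5.0, 0.0]], dtype=np.float32)

# A dense-style input (None, 3) accepts a (1, 3) feature array.
print(shapes_compatible((None, 3), raw.shape))     # True
# An input with an extra axis, e.g. (None, 3, 1), does not.
print(shapes_compatible((None, 3, 1), raw.shape))  # False

scaled = standardize(raw, train_mean, train_std)
print(scaled.shape)  # (1, 3)
```

If the shapes do not match, the features need the same layout that was used during training (a CNN may expect an additional axis, but that depends on training code that is not shown here).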

import tensorflow as tf
import librosa
import numpy as np

# Load the pre-trained model
model = tf.keras.models.load_model('C:/Users/VOLKAN/Desktop/SonProject/model.keras')

# Define the genres
genres = ['Blues', 'Classical', 'Country', 'Disco', 'Hip-hop', 'Jazz', 'Metal', 'Pop', 'Reggae', 'Rock']

def extract_features(file_path):
    y, sr = librosa.load(file_path, duration=30)
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=58)
    mfccs = np.mean(mfccs.T, axis=0)
    features = mfccs[np.newaxis, ...]
    return features

def predict_genre(file_path):
    features = extract_features(file_path)
    print("Shape of extracted features:", features.shape)
    predictions = model.predict(features)
    print("Shape of predictions:", predictions.shape)
    genre_index = np.argmax(predictions, axis=1)[0]
    return genres[genre_index]

# Example usage
audio_file = 'C:/Users/VOLKAN/Desktop/Data/genres_original/classical/classical.00059.wav'
predicted_genre = predict_genre(audio_file)
print(f'The predicted genre is: {predicted_genre}')
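Beyond printing shapes, it can help to look at the full probability vector instead of only the argmax. The following is a minimal sketch, not part of the original answer: top_genres is a hypothetical helper, and fake_predictions is a made-up array standing in for the output of model.predict(features). It also assumes the order of the genres list matches the label order used during training, which the question does not confirm.

```python
import numpy as np

genres = ['Blues', 'Classical', 'Country', 'Disco', 'Hip-hop',
          'Jazz', 'Metal', 'Pop', 'Reggae', 'Rock']

def top_genres(probabilities, genres, k=3):
    """Return the k most probable (genre, probability) pairs, best first."""
    probabilities = np.asarray(probabilities, dtype=float).ravel()
    if probabilities.size != len(genres):
        raise ValueError('number of model outputs does not match number of genres')
    # argsort is ascending, so reverse it and keep the first k indices.
    order = np.argsort(probabilities)[::-1][:k]
    return [(genres[i], float(probabilities[i])) for i in order]

# Made-up stand-in for model.predict(features), shape (1, 10).
fake_predictions = np.array([[0.02, 0.70, 0.03, 0.01, 0.01,
                              0.15, 0.02, 0.02, 0.02, 0.02]])

ranked = top_genres(fake_predictions, genres)
for name, probability in ranked:
    print(f'{name}: {probability:.2f}')
```

If the probabilities are nearly identical for every file, or one genre always wins regardless of the input, that points toward a mismatch between the features used here and those used in training rather than a problem with the model itself.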
