Classification of an imbalanced dataset

Question  Votes: 0  Answers: 1

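I have a dataset with 3 classes, collected from 40 people. Some people have data for all 3 classes; others have data for only 2 classes or 1. I am trying to classify it using leave-one-person-out cross-validation, but the results are poor. How should I handle this kind of classification problem?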

I tried leave-one-person-out cross-validation, but it did not work.

python dataset classification cross-validation
1 Answer
0 votes
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
import numpy as np

X = np.array([[...], [...], ...])  # Replace with your features
y = np.array([0, 1, 2, 0, 1, 1, ...])  # Replace with your labels

classifier = SVC(kernel='linear', C=1)
pipeline = make_pipeline(StandardScaler(), classifier)

stratified_kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
accuracy_scores = []
for train_index, test_index in stratified_kfold.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]


    # Fit the scaler and classifier on the training fold only
    pipeline.fit(X_train, y_train)

    # Predict on the held-out fold and record its accuracy
    predictions = pipeline.predict(X_test)
    accuracy = accuracy_score(y_test, predictions)
    accuracy_scores.append(accuracy)

print("Accuracy scores for each fold:", accuracy_scores)

print("Mean Accuracy:", np.mean(accuracy_scores))