Semi-supervised SVM model that runs forever

Question

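I am experimenting with the Elliptic Bitcoin dataset and want to compare how supervised and semi-supervised models perform on it. Here is the code for my supervised SVM model: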

from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Keep only the labeled transactions (class '1' or '2')
classified = class_features_df[class_features_df['class'].isin(['1','2'])]

X = classified.drop(columns=['txId', 'class', 'time step'])
y = classified[['class']]

# Class 2 corresponds to licit transactions; recode it as 0 because the illicit transactions are the class of interest
y = y['class'].apply(lambda x: 0 if x == '2' else 1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=15, shuffle=False)

model_svm = svm.SVC(kernel='linear')  # Linear kernel

model_svm.fit(X_train, y_train)

# Find the accuracy score
y_pred = model_svm.predict(X_test)
acc = accuracy_score(y_test, y_pred)

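The code above runs fine and gives good results, but when I try the same approach for semi-supervised learning I get a warning, and the model has now been running for more than an hour (the supervised version finished in under a minute):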

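import pandas as pd

# Split the labeled training data again; the held-out 30% is used for evaluation below
X_train_lab, X_test_unlab, y_train_lab, y_test_unlab = train_test_split(X_train, y_train, test_size=0.30, random_state=1, stratify=y_train)

# Unlabeled transactions; 'class' holds strings, as in isin(['1','2']) above
unclassified = class_features_df[class_features_df['class'] == '3'].copy()  # .copy() avoids SettingWithCopyWarning

# local_features_col and agg_features_col are lists of feature column names defined elsewhere
X_unclassified = unclassified[local_features_col + agg_features_col]

# Pseudo-label the unlabeled rows with the supervised model (predictions are 0/1)
predictions = model_svm.predict(X_unclassified.values)

unclassified['class'] = predictions

# Use the same 0/1 encoding for the labeled rows so they match the pseudo-labels
labeled = classified.copy()
labeled['class'] = y

# Combine the labeled and newly labeled unlabeled data
classified = pd.concat([labeled, unclassified])  # DataFrame.append was removed in pandas 2.0

X = classified.drop(columns=['txId', 'class', 'time step'])
y = classified['class'].astype('int')  # astype('int') added to remove the "'<' not supported between instances of 'int' and 'str'" error

model_svm.fit(X, y)

# Evaluate the model on the test set
y_pred = model_svm.predict(X_test_unlab)
acc = accuracy_score(y_test_unlab, y_pred)
print("Accuracy", acc)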
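For comparison, the pseudo-labeling steps above can be sketched on synthetic data. This is only a rough sketch under stated assumptions, not a fix verified on the Elliptic dataset: the data comes from `make_classification`, and `LinearSVC` is swapped in for `svm.SVC(kernel='linear')` because the scikit-learn documentation notes that the fit time of `SVC` scales at least quadratically with the number of samples, which may matter once the unlabeled rows are added to the training set. The variable names (`base_model`, `final_model`, `X_combined`, and so on) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in for the transaction features (assumption: not the real data)
X_all, y_all = make_classification(n_samples=2000, n_features=20, random_state=0)

# Pretend 60% of the rows are unlabeled; their true labels are discarded
X_lab, X_unlab, y_lab, _ = train_test_split(
    X_all, y_all, test_size=0.6, random_state=0, stratify=y_all
)

# Hold out a test set BEFORE pseudo-labeling so it never enters training
X_train, X_test, y_train, y_test = train_test_split(
    X_lab, y_lab, test_size=0.3, random_state=1, stratify=y_lab
)

# Step 1: fit on the labeled training rows only
base_model = LinearSVC(C=1.0, max_iter=5000, random_state=0)
base_model.fit(X_train, y_train)

# Step 2: pseudo-label the unlabeled rows (same 0/1 encoding as y_train)
pseudo_labels = base_model.predict(X_unlab)

# Step 3: refit on labeled + pseudo-labeled rows
X_combined = np.vstack([X_train, X_unlab])
y_combined = np.concatenate([y_train, pseudo_labels])
final_model = LinearSVC(C=1.0, max_iter=5000, random_state=0)
final_model.fit(X_combined, y_combined)

# Step 4: evaluate on the untouched test rows
acc = accuracy_score(y_test, final_model.predict(X_test))
print("Accuracy", acc)
```

Two choices in this sketch are worth noting: the test rows are split off before any pseudo-labeling so they are never part of the training data, and the labeled and pseudo-labeled targets share one 0/1 encoding so the refit model still sees a binary problem.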

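Additional information: classes 1 and 2 are the labeled transactions, and class 3 marks the unlabeled (unclassified) transactions. Here is a picture of the first 5 rows of the dataset: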
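As an illustration of the label scheme just described, the three class values could be mapped to the encoding that scikit-learn's semi-supervised estimators expect, where -1 marks an unlabeled sample. The sketch below is hedged: it uses synthetic data rather than the real dataset, the `raw_class` series is a hypothetical stand-in for the `class` column, and wrapping `LinearSVC` in `CalibratedClassifierCV` is one possible way to give `SelfTrainingClassifier` the `predict_proba` method it needs, not necessarily the best one.

```python
import numpy as np
import pandas as pd
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in for the dataset (assumption: not the real Elliptic data)
X, y_true = make_classification(n_samples=1000, n_features=10, random_state=0)

# Build a string 'class' column like the one described:
# '1' = illicit, '2' = licit, '3' = unlabeled
rng = np.random.default_rng(0)
raw_class = pd.Series(np.where(y_true == 1, '1', '2'))
raw_class[rng.random(len(raw_class)) < 0.7] = '3'  # hide roughly 70% of the labels

# Map to the encoding scikit-learn expects: 1 = illicit, 0 = licit, -1 = unlabeled
y_semi = raw_class.map({'1': 1, '2': 0, '3': -1}).to_numpy()

# LinearSVC has no predict_proba, so wrap it in a calibrator (a choice made for this sketch)
base = CalibratedClassifierCV(LinearSVC(max_iter=5000, random_state=0), cv=3)

# Self-training: repeatedly add confidently predicted unlabeled rows to the training set
self_training = SelfTrainingClassifier(base, threshold=0.9, max_iter=10)
self_training.fit(X, y_semi)

pred = self_training.predict(X)
print(pd.Series(pred).value_counts())
```

With this encoding the labeled and unlabeled rows stay in one array, and the estimator decides which pseudo-labels are confident enough to use instead of accepting every prediction.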

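Is something wrong with my semi-supervised implementation, or is anything missing? Any help with the code would be greatly appreciated.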
