Classifying spiral data


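I have separable two-class spiral data: a blue class and a red class, each spiraling outward from the origin. I know that KNN and SVM are suitable for classifying it, but I would like to know whether logistic regression can also give decent classification results.

I have tried several feature sets, such as (r, theta) and (sin x, sin y, r), but none of them seems to work well.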

classification logistic-regression spiral
1 Answer

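"I know that KNN and SVM are suitable for classifying it, but I would like to know whether logistic regression can also give decent classification results."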

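An SVM is a linear model that, through its kernel, can implicitly transform the features into a higher-dimensional space. If you transform the features explicitly beforehand and then feed them into logistic regression, you can get results similar to an SVM.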
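To make the idea of an explicit transform concrete, here is a minimal sketch that compares the exact RBF kernel matrix with the inner products of explicit Nystroem features. The value gamma = 1.0 and the sample size of 40 are assumptions chosen only for illustration. With one landmark per training sample, the explicit features should reproduce the kernel on the training points up to numerical error; with fewer landmarks they only approximate it.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem
from sklearn.metrics.pairwise import rbf_kernel

X, _ = make_moons(n_samples=40, noise=0.2, random_state=0)
gamma = 1.0  # assumed kernel width, for illustration only

# Exact RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
K_exact = rbf_kernel(X, gamma=gamma)

# Explicit feature map with one landmark per training sample
feature_map = Nystroem(kernel='rbf', gamma=gamma,
                       n_components=len(X), random_state=0)
Z = feature_map.fit_transform(X)

# Inner products of the explicit features stand in for the kernel values
K_from_features = Z @ Z.T

max_abs_error = np.abs(K_exact - K_from_features).max()
print(f"features per sample: {Z.shape[1]}")
print(f"max abs difference from exact kernel: {max_abs_error:.2e}")
```

A linear model fit on Z therefore works with roughly the same similarity structure that an RBF SVM uses implicitly.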

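The code below explicitly maps the features into an approximate RBF feature space (an RBF SVM does this implicitly) and then feeds the transformed features to LogisticRegression(). The result of SVC(kernel='rbf') is shown for comparison.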

import matplotlib.pyplot as plt
import numpy as np

from sklearn.datasets import make_moons


#Create dataset (two interleaving half-moons, used here as a stand-in for spiral data)
X, y = make_moons(n_samples=100, noise=0.2, random_state=0)

plt.scatter(X[:, 0], X[:, 1], c=y, cmap='seismic')
plt.gcf().set_size_inches(5, 3)

#Create RBF features
# and feed them into the logistic regression model
from sklearn.linear_model import LogisticRegression
from sklearn.kernel_approximation import RBFSampler, Nystroem
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    # ('rbf_features', RBFSampler(gamma=0.5, random_state=0)), #faster & approximate
    ('rbf_features', Nystroem(random_state=0)),
    ('logistic_regression', LogisticRegression(C=20, random_state=0))
])
#Fit logistic regression model on the new features
pipeline.fit(X, y)

#View the decision boundary
xx, yy = np.meshgrid(
    np.linspace(-2.5, 2.5, num=50),
    np.linspace(-2.5, 2.5, num=50)
)

proba_map = pipeline.predict_proba(np.column_stack([xx.ravel(), yy.ravel()]))
proba_map = proba_map[:, 1].reshape(xx.shape)

plt.contourf(xx, yy, proba_map, zorder=-1, cmap='coolwarm', alpha=0.5)
plt.xlabel('feature 0')
plt.ylabel('feature 1')
plt.title('Logistic regression fit on RBF features')
plt.colorbar(label='probability')
plt.show()

#
#SVM for comparison
#
from sklearn.svm import SVC
svc = SVC(kernel='rbf', probability=True).fit(X, y)

#Get probabilities (or decision map)
# decision_values = svc.decision_function(np.column_stack([xx.ravel(), yy.ravel()]))
proba_map = svc.predict_proba(np.column_stack([xx.ravel(), yy.ravel()]))[:, 1]
proba_map = proba_map.reshape(xx.shape)

plt.scatter(X[:, 0], X[:, 1], c=y, cmap='seismic')
plt.gcf().set_size_inches(5, 3)
plt.contourf(xx, yy, proba_map, zorder=-1, cmap='coolwarm', alpha=0.5)
plt.xlabel('feature 0')
plt.ylabel('feature 1')
plt.title('SVM with RBF kernel')
plt.colorbar(label='probability')
plt.show()
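
The dataset above is make_moons rather than a spiral. As a rough sketch of how the same pipeline might be applied to spiral data like that in the question, the snippet below generates a synthetic two-arm spiral and reports held-out accuracy. The make_spirals helper is hypothetical (it is not part of scikit-learn), and gamma=10, n_components=100 and C=100 are assumed, untuned values; real spiral data with more turns or more noise would probably need different settings.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def make_spirals(n_per_class=200, turns=1.5, noise=0.03, seed=0):
    """Hypothetical helper: two interleaved spiral arms around the origin."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.25, 1.0, n_per_class)   # radius grows along the arm
    angle = 2 * np.pi * turns * t             # angle grows with the radius
    arm = np.column_stack([t * np.cos(angle), t * np.sin(angle)])
    X = np.vstack([arm, -arm])                # second arm is rotated by 180 degrees
    X = X + rng.normal(scale=noise, size=X.shape)
    y = np.repeat([0, 1], n_per_class)
    return X, y


X, y = make_spirals()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Same idea as above: explicit RBF features, then a linear classifier
spiral_pipeline = Pipeline([
    ('rbf_features', Nystroem(kernel='rbf', gamma=10, n_components=100,
                              random_state=0)),
    ('logistic_regression', LogisticRegression(C=100, max_iter=1000)),
])
spiral_pipeline.fit(X_train, y_train)

test_acc = spiral_pipeline.score(X_test, y_test)
print(f"held-out accuracy on the synthetic spiral: {test_acc:.3f}")
```

The gamma value matters most here: the RBF width has to be narrower than the gap between neighbouring arms, otherwise the features blur the two classes together.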