ValueError: Expected 2D array, got 1D array instead

Question (votes: 0, answers: 2)

I am a beginner in data science and am currently building a model on the IBM employee attrition dataset. How can I resolve this error?

# LogisticRegression
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
from sklearn.model_selection import train_test_split

#Copy the DataFrame
df1 = df.copy()

#Convert categorical variables to numeric 
dummy_df = pd.get_dummies(df1, columns=["Attrition", "BusinessTravel", "Department", "EducationField", 
                                        "Gender", "JobRole", "OverTime", "MaritalStatus"], drop_first = True)
dummy_df = pd.concat([df1, dummy_df], axis=1)

dummy_df = dummy_df.drop(["Attrition", "BusinessTravel", "Department", "EducationField", 
                                        "Gender", "JobRole", "OverTime", "MaritalStatus"], axis=1)

dummy_df.rename({"Attrition_Yes":"Attrition", "OverTime_Yes":"OverTime"}, axis=1, inplace=True)

#Drop duplicate columns
dummy_df = dummy_df.loc[:,~dummy_df.columns.duplicated()]

X = dummy_df.drop("Attrition", axis=1).values

y = dummy_df["Attrition"].values


X_train, X_test,  y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=15, stratify=y)

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

y_pred = logreg.predict(X_test)

logreg.score(y_pred, y_test)

ValueError: Expected 2D array, got 1D array instead:
python model regression data-science logistic-regression
2 Answers

0 votes

You are probably passing a pandas Series instead of a DataFrame. Instead of df['column'], pass df[['column']]. If that does not work, please share more of your code.
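
To illustrate the difference between single and double brackets, here is a minimal sketch on a made-up DataFrame. The column names and values are hypothetical and are not taken from the IBM dataset.

```python
import pandas as pd

# Hypothetical toy frame; column names and values are illustrative only.
df = pd.DataFrame({"Age": [29, 41, 35], "MonthlyIncome": [3000, 5200, 4100]})

series = df["Age"]      # single brackets -> pandas Series (1D)
frame = df[["Age"]]     # double brackets -> one-column DataFrame (2D)

print(series.values.shape)  # (3,)
print(frame.values.shape)   # (3, 1)

# A 1D array can also be reshaped into a single-feature 2D array.
as_column = series.values.reshape(-1, 1)
print(as_column.shape)      # (3, 1)
```

scikit-learn estimators expect the feature matrix in the second, 2D form, while the target y may stay 1D.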


0 votes

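One line to look at:

X = dummy_df.drop("Attrition", axis=1).values

Fitting and transforming expect a 2D array for X and a 1D array for y. Note that .values is an attribute, not a method, and on a DataFrame it returns a 2D NumPy array; you only get a 1D array when it is taken from a single column (a Series). The array itself is therefore fine here, but the conversion is unnecessary and discards the column names.

It is simpler to leave X as a DataFrame:

X = dummy_df.drop("Attrition", axis=1)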
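
Another likely cause, going by the question's code, is the final call logreg.score(y_pred, y_test). The score method takes the feature matrix as its first argument and predicts internally, so passing the 1D predictions there triggers this error. The sketch below reproduces the error and shows two ways to get the accuracy. It uses synthetic data from make_classification as a stand-in for the attrition data, and max_iter=1000 is my own choice, not something from the question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the attrition data (assumption: not the IBM dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=15)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=15, stratify=y
)

logreg = LogisticRegression(max_iter=1000)
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)

# Passing 1D predictions where the 2D feature matrix belongs raises ValueError.
try:
    logreg.score(y_pred, y_test)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# score() takes the features and the true labels, and predicts internally.
acc_from_score = logreg.score(X_test, y_test)

# Equivalent: compare the true labels with the predictions made earlier.
acc_from_metric = accuracy_score(y_test, y_pred)
```

Both values are the same mean accuracy, so either form can replace the failing call.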