Cannot get fit() to work in a linear regression model


I am trying to fit a regression model. On macOS (M1), the script runs fine up to the last line, the call to fit():

import pandas as pd
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
df=pd.read_csv('USA_Housing.csv')

column=df.columns

X=df[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
       'Avg. Area Number of Bedrooms', 'Area Population', 'Address']]
y=df['Price']

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)

from sklearn.linear_model import LinearRegression

lm=LinearRegression()
lm.fit(X_train,y_train) # this throws an error
Running the script in PyCharm produces the following error:

Traceback (most recent call last):
  File "/Users/krit/PycharmProjects/PythonRefresh/main.py", line 21, in <module>
    lm.fit(X_train,y_train)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/base.py", line 1151, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/linear_model/_base.py", line 678, in fit
    X, y = self._validate_data(
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/base.py", line 621, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/utils/validation.py", line 1147, in check_X_y
    X = check_array(
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/utils/validation.py", line 917, in check_array
    array = _asarray_with_order(array, order=order, dtype=dtype, xp=xp)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/sklearn/utils/_array_api.py", line 380, in _asarray_with_order
    array = numpy.asarray(array, order=order, dtype=dtype)
  File "/Users/krit/PycharmProjects/PythonRefresh/venv/lib/python3.9/site-packages/pandas/core/generic.py", line 2084, in __array__
    arr = np.asarray(values, dtype=dtype)
ValueError: could not convert string to float: '1836 Shaw Lane Apt. 733\nGracetown, PW 83118-5264'

I am learning regression with scikit-learn. The code works on Windows, but after I installed PyCharm on macOS it no longer works. How can I fix it?

python macos machine-learning scikit-learn linear-regression
1 Answer

You are trying to run linear regression on string data. For a question similar to yours, see this answer. As the error message states clearly:

ValueError: could not convert string to float: '1836 Shaw Lane Apt. 733\nGracetown, PW 83118-5264'

The library you are using tries to convert this string to a float, which is not possible, and that is what raises the error.
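The failure can be reproduced without scikit-learn. The sketch below is only an illustration: the two rows are invented stand-ins, not values from USA_Housing.csv. It asks NumPy for a float array from a frame that contains a text column, which is roughly what the traceback shows scikit-learn doing during input validation.

```python
import numpy as np
import pandas as pd

# Invented stand-in rows (not taken from USA_Housing.csv).
frame = pd.DataFrame({
    "Avg. Area Income": [60000.0, 72000.0],
    "Address": ["1 Example Street", "2 Example Avenue"],
})

message = ""
try:
    # Request a float array, as the last frames of the traceback do.
    np.asarray(frame, dtype=float)
except ValueError as exc:
    message = str(exc)

print(frame.dtypes)  # shows which column is not numeric
print(message)
```

Printing df.dtypes on the real data should likewise show which columns hold text rather than numbers.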

Solution

A very quick fix is to drop every column that may contain string values, such as Address.
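A minimal sketch of that quick fix, assuming a small invented frame in place of USA_Housing.csv (the values and the reduced column set are placeholders): keep only the numeric columns before calling fit().

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Invented stand-in rows; the real script reads USA_Housing.csv instead.
df = pd.DataFrame({
    "Avg. Area Income": [60000.0, 72000.0, 81000.0, 55000.0],
    "Area Population": [30000.0, 41000.0, 36000.0, 28000.0],
    "Address": [
        "1 Example Street\nSampletown, AA 00001",
        "2 Example Avenue\nSampleville, BB 00002",
        "3 Example Road\nSampleburg, AA 00003",
        "4 Example Lane\nSampleton, BB 00004",
    ],
    "Price": [900000.0, 1200000.0, 1350000.0, 800000.0],
})

# Drop the target, then keep only numeric feature columns (Address is excluded).
X = df.drop(columns=["Price"]).select_dtypes(include="number")
y = df["Price"]

lm = LinearRegression()
lm.fit(X, y)  # no string column left, so the float conversion succeeds
print(list(X.columns))
```

In the original script the same effect can be had by simply leaving 'Address' out of the column list used to build X.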

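Besides, I don't think the full address of a house is needed to make a good prediction. I would either drop the column or keep only part of it, such as "Shaw Lane Apt".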

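So either drop the column or convert it to numbers. A free tip: if you do want to use the address column, group it by region and apply one-hot encoding (although this adds complexity to the project).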
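One possible way to do that grouping is sketched here. It assumes that the two-letter code in front of the ZIP code can stand in for a "region", which is a guess based on the single address shown in the error message. The second and third addresses are invented for illustration.

```python
import pandas as pd

addresses = pd.Series([
    "1836 Shaw Lane Apt. 733\nGracetown, PW 83118-5264",  # from the error message
    "1 Example Street\nSampletown, AA 00001",              # invented
    "2 Example Avenue\nSampleville, PW 00002",             # invented
], name="Address")

# Assumption: the two capital letters before a 5-digit ZIP code identify the region.
region = addresses.str.extract(r"([A-Z]{2}) \d{5}", expand=False)

# One-hot encode the region so the model only sees numeric columns.
region_dummies = pd.get_dummies(region, prefix="Region", dtype=float)
print(region_dummies)
```

The resulting columns could then be joined to the numeric features with pd.concat(..., axis=1) in place of the raw Address column. With many distinct regions this adds a column per region, which is the extra complexity mentioned above.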
