错误:y无法将字符串转换为float python随机森林

问题描述 投票:1回答:1

我正在使用Python和随机森林来预测我的输入文件的第一列,我的输入文件的格式为:

T,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
N,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
N,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
N,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
N,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
N,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
N,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

这里是我完整数据的链接:https://drive.google.com/file/d/1gjKoSi4rmMYZVm31LZ2Li92HM9USlu6A/view?usp=sharing

我试图根据剩余列的值来预测第一列T或N,并且我正在使用随机森林。我收到以下错误,如何解决?这是代码:

enter image description here

import pandas as pd
import numpy as np
dataset = pd.read_csv( 'data1extended.txt', sep= ',') 
dataset.head()
row_count, column_count = dataset.shape
X = dataset.iloc[:, 1:column_count].values
y = dataset.iloc[:, 0].values

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from sklearn.ensemble import RandomForestRegressor

regressor = RandomForestRegressor(n_estimators=20, random_state=0)
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))
print(accuracy_score(y_test, y_pred))
python random-forest prediction
1个回答
0
投票

尝试先将目标变量更改为数字。假设“ gold”列是您的目标,则在将数据加载到数据框后立即运行它。

dataset['gold'] = dataset['gold'].astype('category').cat.codes
© www.soinside.com 2019 - 2024. All rights reserved.