How to create the target (y) and X variables from a CSV file

Question

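I am reading a CSV file and, for modeling purposes, I need to create the target (y) and X variables. I am not sure how to set this up. I am new to coding and need some guidance that I can't seem to work out from the pandas documentation. I want the target to be "Bad Indicator" and X to be all the other columns.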

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import pandas as pd
project = pd.read_csv('c:/users/Brandon Thomas/Project.csv')
project=pd.DataFrame(project)
df = pd.DataFrame(project.data, columns = project.feature_names)
df["Bad Indicator"] = x.target
X = df.drop("Bad Indicator",axis=1)   #Feature Matrix
y = df["Bad Indicator"]          #Target Variable
df.head()

AttributeError                            Traceback (most recent call last)
      1 # build the data frame
----> 2 df = pd.DataFrame(project.data, columns = project.feature_names)
      3 df["Bad Indicator"] = x.target
      4 X = df.drop("Bad Indicator",axis=1)   #Feature Matrix
      5 y = df["Bad Indicator"]          #Target Variable

~\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5065             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5066                 return self[name]
-> 5067             return object.__getattribute__(self, name)
   5068 
   5069     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'data'
python pandas dataframe target
1 Answer

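In the code above you create a DataFrame three separate times: once with pd.read_csv, once with project = pd.DataFrame(project), and once with df = pd.DataFrame(...). pd.read_csv already returns a DataFrame, so the two extra conversions are unnecessary. The error itself comes from project.data: a DataFrame has no data attribute (nor feature_names or target). Those names look like they were copied from an example that loads a scikit-learn dataset object, which is not what read_csv gives you.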
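As a quick check of the point above, here is a minimal sketch that reads a small in-memory CSV instead of the real file. The CSV contents and the column names other than "Bad Indicator" are made up for illustration; only the behaviour of pd.read_csv and the missing attribute are what matter.

```python
import io
import pandas as pd

# Hypothetical stand-in for Project.csv; "Income" and "Age" are invented column names.
csv_text = "Bad Indicator,Income,Age\n1,30000,25\n0,52000,41\n"

project = pd.read_csv(io.StringIO(csv_text))

# read_csv already returns a DataFrame, so wrapping it in pd.DataFrame() adds nothing.
is_dataframe = isinstance(project, pd.DataFrame)

# A DataFrame has no `.data` attribute (and no column called "data" here),
# so this reproduces the AttributeError from the question.
try:
    project.data
    raised = False
except AttributeError:
    raised = True

print(is_dataframe, raised)
```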

To set up Y and X, this is all you need to do:

import pandas as pd

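# read_csv returns a DataFrame; the columns are named automatically if the CSV has a header row
df = pd.read_csv('c:/users/Brandon Thomas/Project.csv')

# If your CSV has no header row, pass header=None to read_csv (otherwise the first
# data row is used up as column names) and name the columns yourself:
df.columns = ['Bad Indicator', 'ColumnName1', 'ColumnName2', ...]

X = df.drop("Bad Indicator", axis=1)   # feature matrix: every column except the target
Y = df["Bad Indicator"]                # target variable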

df.head()

If the CSV does have a header row, delete the df.columns line.
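For the headerless case, a self-contained sketch might look like the following. It assumes the target is the first column and uses invented names ("Income", "Age") and invented rows; substitute the real column names from your file. Passing header=None together with names= is one way to keep the first row as data while naming the columns in the same call.

```python
import io
import pandas as pd

# Hypothetical headerless CSV; the first column is assumed to be the target.
csv_text = "1,30000,25\n0,52000,41\n1,18000,33\n"

# header=None stops pandas from treating the first row as column names;
# names= supplies the column names (these are placeholders, not the real ones).
df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["Bad Indicator", "Income", "Age"],
)

X = df.drop("Bad Indicator", axis=1)   # feature matrix: every column except the target
Y = df["Bad Indicator"]                # target variable

print(X.shape, Y.shape)
```

X stays a DataFrame and Y is a Series, which is the form most modeling libraries accept for features and target.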
