How to put categorical data into bins

Question  Votes: 0  Answers: 2

I have the following categorical data:

['Self employed', 'Government Dependent',
 'Formally employed Private', 'Informally employed',
 'Formally employed Government', 'Farming and Fishing',
 'Remittance Dependent', 'Other Income',
 'Dont Know/Refuse to answer', 'No Income']

How can I put them into bins like this, for example:

 ['Government Dependent','Formally employed Government','Formally employed Private'] = 0

 ['Remittance Dependent', 'Informally employed','Self employed','Other Income'] = 1
 ['Dont Know/Refuse to answer', 'No Income','Farming and Fishing'] = 2

I already know how to bin numerical data into categories... can the reverse be done, mapping categories to numeric bins?

import pandas as pd

TRAIN = pd.read_csv("Train_v2.csv")
TRAIN['job_type'].unique()
output:
array(['Self employed', 'Government Dependent',
       'Formally employed Private', 'Informally employed',
       'Formally employed Government', 'Farming and Fishing',
       'Remittance Dependent', 'Other Income',
       'Dont Know/Refuse to answer', 'No Income'], dtype=object)
python pandas machine-learning categorical-data
2 Answers
0
votes

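First create a dictionary that maps each bin to its categories, then swap keys and values so that each category maps to its bin, and finally apply it with Series.map.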
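The answer's original code did not survive on this page, so the following is only a sketch of the dictionary-and-map idea. The names `bins`, `category_to_bin` and `job_bin`, and the small stand-in frame used in place of `Train_v2.csv`, are assumptions for illustration; the bin contents are taken from the question.

```python
import pandas as pd

# Bin label -> categories, as listed in the question.
bins = {
    0: ['Government Dependent', 'Formally employed Government',
        'Formally employed Private'],
    1: ['Remittance Dependent', 'Informally employed',
        'Self employed', 'Other Income'],
    2: ['Dont Know/Refuse to answer', 'No Income', 'Farming and Fishing'],
}

# Swap keys and values: category -> bin label.
category_to_bin = {cat: label for label, cats in bins.items() for cat in cats}

# Hypothetical stand-in for the TRAIN frame read from Train_v2.csv.
TRAIN = pd.DataFrame({'job_type': ['Self employed', 'No Income',
                                   'Formally employed Private',
                                   'Farming and Fishing']})

# Series.map looks each value up in the dictionary;
# categories missing from the dictionary would become NaN.
TRAIN['job_bin'] = TRAIN['job_type'].map(category_to_bin)
print(TRAIN)
```

Writing the dictionary bin-first keeps it short and readable; the comprehension does the inversion so each category does not have to be typed next to its label.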


0
votes

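You can use numpy.select and set np.nan as the value for anything that does not belong to category 0, 1 or 2:

import numpy as np

m1 = TRAIN['job_type'].isin(['Government Dependent', 'Formally employed Government', 'Formally employed Private'])
m2 = TRAIN['job_type'].isin(['Remittance Dependent', 'Informally employed', 'Self employed', 'Other Income'])
m3 = TRAIN['job_type'].isin(['Dont Know/Refuse to answer', 'No Income', 'Farming and Fishing'])
TRAIN['new'] = np.select([m1, m2, m3], [0, 1, 2], np.nan)

For more background, see the NumPy documentation on np.where and np.nan.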
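To see the np.nan default in action without the CSV file, here is a small self-contained sketch. The frame `df` and the category 'Student' are invented for illustration and are not part of the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for TRAIN; 'Student' is an invented category
# that belongs to none of the three bins.
df = pd.DataFrame({'job_type': ['Government Dependent', 'Informally employed',
                                'No Income', 'Student']})

conditions = [
    df['job_type'].isin(['Government Dependent', 'Formally employed Government',
                         'Formally employed Private']),
    df['job_type'].isin(['Remittance Dependent', 'Informally employed',
                         'Self employed', 'Other Income']),
    df['job_type'].isin(['Dont Know/Refuse to answer', 'No Income',
                         'Farming and Fishing']),
]

# Rows matching no condition receive the default, np.nan.
df['new'] = np.select(conditions, [0, 1, 2], default=np.nan)
print(df)
```

Because NaN is a float, the resulting column is floating point even though the bin labels are integers.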
