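I applied a ColumnTransformer to my data to encode categorical features and scale numeric features. The result merges all the transformations into a single column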


All of the transformed outputs are merged into a single column. My DataFrame has shape (445132, 34), and after the transform it is reduced to (445132, 1).

The `categorical` object contains all categorical columns except "GeneralHealth"; `numerical` contains all numeric columns.

Here is my code:

trans = ColumnTransformer(transformers=[
    ("encoder", OrdinalEncoder(categories=[["Excellent","Very good","Good","Fair","Poor"]]), ["GeneralHealth"]),
    ("encoder1", OneHotEncoder(drop="first"), categorical),
    ("scaler", StandardScaler(), numerical)
], remainder="passthrough")

f_transformed = trans.fit_transform(f)

transformed_data = pd.DataFrame(f_transformed, columns=trans.get_feature_names_out())

transformed_data.head(4)

I then made one addition (`verbose_feature_names_out=False`):

trans = ColumnTransformer(transformers=[
    ("encoder", OrdinalEncoder(categories=[["Excellent","Very good","Good","Fair","Poor"]]), ["GeneralHealth"]),
    ("encoder1", OneHotEncoder(drop="first"), categorical),
    ("scaler", StandardScaler(), numerical)
], remainder="passthrough", verbose_feature_names_out=False)

f_transformed = trans.fit_transform(f)

transformed_data = pd.DataFrame(f_transformed, columns=trans.get_feature_names_out())

transformed_data.head(4)
machine-learning scikit-learn encoding scaling
1 Answer

If you are using sklearn version 1.2 or newer, try the following:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder, OneHotEncoder, StandardScaler

# Define the transformer
trans = ColumnTransformer(transformers=[
    ("encoder", OrdinalEncoder(categories=[["Excellent", "Very good", "Good", "Fair", "Poor"]]), ["GeneralHealth"]),
    # sparse_output=False: pandas output does not support sparse matrices
    ("encoder1", OneHotEncoder(drop="first", sparse_output=False), categorical),
    ("scaler", StandardScaler(), numerical)
], remainder="passthrough")

# Set the output of the transformer to a pandas DataFrame
trans.set_output(transform="pandas")

# Fit and transform the data
f_transformed = trans.fit_transform(f)

# Now f_transformed should be a DataFrame with the appropriate column names
transformed_data = f_transformed

# Display the first few rows of the DataFrame
print(transformed_data.head(4))