Suppose I have two numeric transformations and two categorical transformations, each applied to a different column: num_col1, num_col2, cat_col1, and cat_col2.
num1_pre = Pipeline([ ... ])
num2_pre = Pipeline([ ... ])
cat1_pre = Pipeline([ ... ])
cat2_pre = Pipeline([ ... ])
preprocessor = ColumnTransformer([
    ("num1_pre", num1_pre, num_col1),
    ("num2_pre", num2_pre, num_col2),
    ("cat1_pre", cat1_pre, cat_col1),
    ("cat2_pre", cat2_pre, cat_col2)
])
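Now suppose that, after these transformations, I want to apply PolynomialFeatures() to the combination of the transformed num_col1 and num_col2. How should I proceed? I know how to do this when there are no categorical transformations, using something like the following: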
preprocessing = ColumnTransformer([
    ('num1_pre', num1_pre, num_col1),
    ('num2_pre', num2_pre, num_col2)
])
full_pipeline = Pipeline([
    ('preprocessing', preprocessing),
    ('poly', PolynomialFeatures())
])
model_pipeline = Pipeline([
    ('full_preprocessing', full_pipeline),
    ('model', LinearRegression())
])
The problem is that I don't know how to do this when categorical transformations are also present. Thanks!
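# By default a ColumnTransformer returns a plain array, so the second step
# must select columns by position, not by their original names.
# Assumption: num1_pre and num2_pre each emit exactly one column, so the
# transformed numeric columns sit at positions 0 and 1 of preprocessor's output.
poly_features = ColumnTransformer(
    [("poly", PolynomialFeatures(), [0, 1])],
    remainder="passthrough")
# `preprocessor` is the four-transformer ColumnTransformer from the question.
full_pipeline = Pipeline([
    ('preprocessing', preprocessor),
    ('poly_features', poly_features),
])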
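For reference, here is a minimal end-to-end sketch of this approach that can be run as-is. Everything the question leaves out is an assumption made for illustration: the toy DataFrame, the contents of the four sub-pipelines (StandardScaler, SimpleImputer, OneHotEncoder), and the premise that each numeric pipeline produces exactly one output column. `sparse_threshold=0` is set on the first ColumnTransformer so that it always returns a dense array, which keeps the positional indexing in the second step straightforward.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "num_col1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "num_col2": [10.0, 8.0, 6.0, 7.0, 3.0, 1.0],
    "cat_col1": ["a", "b", "a", "b", "a", "b"],
    "cat_col2": ["x", "y", "z", "x", "y", "z"],
})
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Placeholder sub-pipelines (the question elides their contents).
num1_pre = Pipeline([("scale", StandardScaler())])
num2_pre = Pipeline([("impute", SimpleImputer(strategy="median"))])
cat1_pre = Pipeline([("onehot", OneHotEncoder(handle_unknown="ignore"))])
cat2_pre = Pipeline([("onehot", OneHotEncoder(handle_unknown="ignore"))])

# Step 1: per-column preprocessing. Output columns follow transformer order,
# so the two numeric results land at positions 0 and 1.
preprocessor = ColumnTransformer(
    [
        ("num1_pre", num1_pre, ["num_col1"]),
        ("num2_pre", num2_pre, ["num_col2"]),
        ("cat1_pre", cat1_pre, ["cat_col1"]),
        ("cat2_pre", cat2_pre, ["cat_col2"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Step 2: polynomial features on positions 0 and 1 only; the encoded
# categorical columns pass through unchanged.
poly_features = ColumnTransformer(
    [("poly", PolynomialFeatures(degree=2), [0, 1])],
    remainder="passthrough",
)

full_pipeline = Pipeline([
    ("preprocessing", preprocessor),
    ("poly_features", poly_features),
])
features = full_pipeline.fit_transform(df)
print(features.shape)  # (6, 11): 6 polynomial terms + 2 + 3 one-hot columns

model_pipeline = Pipeline([
    ("full_preprocessing", full_pipeline),
    ("model", LinearRegression()),
])
model_pipeline.fit(df, y)
predictions = model_pipeline.predict(df)
```

If a numeric sub-pipeline emits more than one column (for example, when num_col1 is a list of several columns), the index list `[0, 1]` would need to be widened to cover all of the numeric outputs.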