Why does the mean absolute error (MAE) of my MultiOutputRegressor approach show one value instead of three?

Question

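I have the following code, in which I need to predict 3 different outputs and then compute the MAE (mean absolute error) for each output. Support vector regression does not natively support multi-output regression the way other models such as random forest and linear regression do. I found that one option is to use the MultiOutputRegressor class, which fits an independent model for each output.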

In the code below, X and X_test hold my features for training and testing, and y and y_test are my targets.

1) First, I want to show that my target (y) does indeed have 3 values.

print(X.shape, X_test.shape,y.shape,y_test.shape)

(10845, 2116) (4648, 2116) (10845, 3) (4648, 3)

2) Next, I have the following code to compute the mean absolute error (MAE), and to train a model and evaluate it on the dataset.

import numpy as np

# Function to calculate mean absolute error
def mae(y_true, y_pred):
    return np.mean(abs(y_true - y_pred))

# Function to take in a model, train it and evaluate it on the test set
def fit_and_evaluate2(model):

    # Train the model with training dataset for features (X) and target (y)
    model.fit(X, y)

    # Make predictions for the test dataset and evaluate the predictions vs the target in the test dataset
    model_pred = model.predict(X_test)
    model_mae = mae(y_test, model_pred)

    # Return the performance metric
    return model_mae

3) When I call this function for support vector regression, the output model_pred does contain 3 values, but the MAE model_mae is only 1 value.

from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor

svm = SVR(C=1000, gamma=0.1)
wrapper = MultiOutputRegressor(svm)

svm_mae = fit_and_evaluate2(wrapper)

print('Support Vector Machine Regression Performance on the test set is')
svm_mae

Support Vector Machine Regression Performance on the test set is
0.19932177495538966

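I don't understand why model_mae shows only one value, since, as shown above, my target y does have 3 values and model_pred also shows 3 values. Am I doing something wrong? I tried it with random forest, and both the predictions and the MAE show 3 values.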

Answer 1

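The reason is that np.mean uses axis=None by default when no axis argument is specified; from the docs:

axis : None or int or tuple of ints, optional

Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.

Because it first flattens the array (so there are no longer 3 separate outputs) and then computes the MAE, the result is a single number.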
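To make the flattening concrete, here is a small sketch on made-up numbers (the arrays are hypothetical, not the asker's data) that compares the default axis=None with axis=0 and axis=1:

```python
import numpy as np

# Hypothetical data: 3 samples (rows) x 2 outputs (columns)
y_true = np.array([[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]])
y_pred = np.array([[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]])
abs_err = np.abs(y_true - y_pred)

overall = np.mean(abs_err)             # axis=None: mean over all 6 entries
per_output = np.mean(abs_err, axis=0)  # one mean per column (per output)
per_sample = np.mean(abs_err, axis=1)  # one mean per row (per sample)

print(overall)      # a single number: 0.75
print(per_output)   # two numbers: 0.5 and 1.0
print(per_sample)   # three numbers: 0.75, 0.5 and 1.0
```

Since every output column has the same number of rows, the single flattened value is just the average of the per-output MAEs.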

You should change your mae function to:

def mae(y_true, y_pred):
    return np.mean(abs(y_true - y_pred), axis=0)

Let's confirm that it works with some dummy data:

import numpy as np

# 2-output data
y_true = np.array([[0.5, 1], [-1, 1], [7, -6]])
y_pred = np.array([[0, 2], [-1, 2], [8, -5]])
mae(y_true, y_pred)
# array([0.5, 1. ])

i.e. a 2-value MAE output, as required.

In fact, we can simply use scikit-learn's mean_absolute_error with the appropriate argument multioutput='raw_values' (docs):

from sklearn.metrics import mean_absolute_error
mean_absolute_error(y_true, y_pred, multioutput='raw_values')
# array([0.5, 1. ])

Arguably, since you are already using scikit-learn, you would be better off using its existing MAE functionality rather than writing your own.
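As a rough sketch of how that could look in the question's workflow, the evaluation function can return mean_absolute_error with multioutput='raw_values' directly. The function name fit_and_evaluate, its explicit data arguments, and the synthetic data are assumptions made for illustration; SVR is left at its default hyperparameters rather than the C and gamma from the question.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR


def fit_and_evaluate(model, X_train, y_train, X_test, y_test):
    """Fit the model and return one MAE per output column."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return mean_absolute_error(y_test, y_pred, multioutput='raw_values')


# Synthetic stand-in data (an assumption, not the asker's dataset):
# 5 features and 3 targets that are linear in the features.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 5))
X_test = rng.normal(size=(20, 5))
W = rng.normal(size=(5, 3))
y_train = X_train @ W
y_test = X_test @ W

wrapper = MultiOutputRegressor(SVR())
svm_mae = fit_and_evaluate(wrapper, X_train, y_train, X_test, y_test)
print(svm_mae.shape)  # (3,): one MAE per output
```

Passing the data in as arguments instead of relying on globals is only a design preference here; the part that changes the result is the multioutput='raw_values' argument.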
