python statsmodels:输出“ formula.api”与““ regression.quantile_regression”的差异]

问题描述 投票:1回答:1

对于使用statsmodelspython模数,我想知道使用statsmodels.formula.apistatsmodels.regression.quantile_regression调用同一过程的差异。特别是,我获得了参数估计的差异。

附有最小工作示例。

#%% Moduls;
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.regression.quantile_regression import QuantReg


#%% Load in sample data;
data = sm.datasets.engel.load_pandas().data

#%% smf-Version;
model1 = smf.quantreg(formula='foodexp ~ income', data=data, missing="drop")
result1 = model1.fit(q=0.5, vcov='robust', kernel='epa', bandwidth='hsheather', max_iter=1000, p_tol=1e-06)

#%% QuantReg-Version;
model2 = QuantReg \
    (
        data['foodexp'].values,
        exog            =           sm.tools.tools.add_constant(data['income']).values,
        missing         =           'drop'
    )
result2 = model2.fit \
    (
        q              =           0.5, vcov='robust', kernel='epa', bandwidth='hsheather', max_iter=1000, p_tol=1e-06
    )

#%% Compare Results;
print(result1.params[0])
print(result2.params[0])
print('Difference times 10^9:       ' + str(abs(10**9*(result1.params[0]-result2.params[0]))))

编辑:我需要编辑答案;我仍然非常感谢下面提出的解决方法,该解决方法不适用于所应用的环境;原因:我没有1个回归变量。请找到附件的修改版本。

#%% Moduls;
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.regression.quantile_regression import QuantReg


#%% Load in sample data;
data = sm.datasets.engel.load_pandas().data
data['income2'] = data['income']**2

#%% smf-Version;
model1 = smf.quantreg(formula='foodexp ~ income + income2', data=data, missing="drop")
result1 = model1.fit(q=0.5, vcov='robust', kernel='epa', bandwidth='hsheather', max_iter=1000, p_tol=1e-06)

#%% QuantReg-Version;
model2 = QuantReg \
    (
        data['foodexp'].values,
        exog            =           sm.tools.tools.add_constant(data[['income', 'income2']].values),
        missing         =           'drop'
    )
result2 = model2.fit \
    (
        q              =           0.5, vcov='robust', kernel='epa', bandwidth='hsheather', max_iter=1000, p_tol=1e-06
    )

#%% Compare Results;
print(result1.params[0])
print(result2.params[0])
print('Difference times 10^9:       ' + str(abs(10**9*(result1.params[0]-result2.params[0]))))
python statsmodels quantile-regression
1个回答
1
投票
您需要对代码进行一些小的更改。这有很大的不同
© www.soinside.com 2019 - 2024. All rights reserved.