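I have successfully fit an OLS model with the statsmodels package in Python. However, the model absorbs one level of a categorical variable into the intercept and does not show it in the interaction results. Specifically, the "Meal_Cat" variable below has 5 levels, and the model picks one of them (the "Low" level) and treats it as the intercept. That is fine, but the problem is that I cannot see its interactions with the other categories (for example, the "Low" by Group interaction).
Here is how I set up the model: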
from statsmodels.formula.api import ols
# Full factorial interaction of the three categorical factors, plus two covariates
model = ols('Cost ~ C(Meal_Cat)*C(Group)*C(Region) + Age + Gender', data=Mealcat_DF).fit()
# Seeing if the overall model is significant
print(f"Overall model F({model.df_model:.0f},{model.df_resid:.0f}) = {model.fvalue:.3f}, p = {model.f_pvalue:.4f}")
print(model.summary())
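I was wondering if anyone could suggest a way to include all of the model's terms in the interaction summary.
If your variables are already strings or categoricals, you can simply try the following.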
import pandas as pd
import seaborn as sns
import statsmodels.formula.api as smf
df = sns.load_dataset('tips')
formula = 'tip ~ sex*smoker*day + total_bill'
model = smf.ols(formula, data=df)
results = model.fit()
print(results.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                    tip   R-squared:                       0.485
Model:                            OLS   Adj. R-squared:                  0.449
Method:                 Least Squares   F-statistic:                     13.35
Date:                Mon, 20 Jan 2020   Prob (F-statistic):           8.29e-25
Time:                        14:21:24   Log-Likelihood:                -344.02
No. Observations:                 244   AIC:                             722.0
Df Residuals:                     227   BIC:                             781.5
Df Model:                          16
Covariance Type:            nonrobust
=========================================================================================================
                                              coef    std err          t      P>|t|     [0.025     0.975]
---------------------------------------------------------------------------------------------------------
Intercept                                   0.9917      0.357      2.777      0.006      0.288      1.695
sex[T.Female]                              -0.0731      0.506     -0.144      0.885     -1.071      0.925
smoker[T.No]                               -0.0427      0.398     -0.107      0.915     -0.827      0.741
day[T.Fri]                                 -0.4549      0.487     -0.933      0.352     -1.415      0.506
day[T.Sat]                                 -0.4662      0.381     -1.224      0.222     -1.217      0.284
day[T.Sun]                                 -0.2880      0.423     -0.681      0.497     -1.121      0.545
sex[T.Female]:smoker[T.No]                 -0.1423      0.593     -0.240      0.811     -1.311      1.026
sex[T.Female]:day[T.Fri]                    0.8553      0.737      1.161      0.247     -0.597      2.307
sex[T.Female]:day[T.Sat]                    0.2319      0.605      0.383      0.702     -0.960      1.424
sex[T.Female]:day[T.Sun]                    1.0867      0.772      1.407      0.161     -0.435      2.608
smoker[T.No]:day[T.Fri]                     0.1224      0.905      0.135      0.893     -1.660      1.905
smoker[T.No]:day[T.Sat]                     0.6258      0.480      1.303      0.194     -0.320      1.572
smoker[T.No]:day[T.Sun]                     0.2552      0.505      0.506      0.614     -0.739      1.250
sex[T.Female]:smoker[T.No]:day[T.Fri]      -0.2185      1.303     -0.168      0.867     -2.787      2.350
sex[T.Female]:smoker[T.No]:day[T.Sat]      -0.4487      0.759     -0.591      0.555     -1.944      1.046
sex[T.Female]:smoker[T.No]:day[T.Sun]      -0.7027      0.892     -0.788      0.431     -2.460      1.054
total_bill                                  0.1078      0.008     13.951      0.000      0.093      0.123
==============================================================================
Omnibus:                       29.744   Durbin-Watson:                   2.154
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               60.768
Skew:                           0.616   Prob(JB):                     6.38e-14
Kurtosis:                       5.112   Cond. No.                         629.

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
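
Note that the summary above still leaves out one level of each factor (Male, Yes, Thur). That is how the default treatment (dummy) coding works: the baseline level is not lost, it is what the intercept describes, and every other coefficient is a difference from that baseline. If the goal is to see the "Low" cells explicitly, two options that may help are choosing a different reference level, or asking the fitted model for a prediction in every cell. The sketch below is only an illustration under assumptions: the data are synthetic, and the column names meal, group, age and cost are hypothetical stand-ins, not the asker's Mealcat_DF.

```python
import itertools

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (hypothetical; not the asker's Mealcat_DF).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "meal": rng.choice(["Low", "Mid", "High"], size=n),
    "group": rng.choice(["A", "B"], size=n),
    "age": rng.integers(20, 60, size=n),
})
df["cost"] = (
    10
    + 2.0 * (df["meal"] == "High")
    + 1.0 * (df["group"] == "B")
    + 0.1 * df["age"]
    + rng.normal(size=n)
)

# Default treatment coding: for string columns the alphabetically first
# level ("High" here) becomes the baseline and gets no coefficient of its own.
default_fit = smf.ols("cost ~ C(meal) * C(group) + age", data=df).fit()

# Same model, but with "Low" explicitly chosen as the baseline.
releveled_fit = smf.ols(
    "cost ~ C(meal, Treatment(reference='Low')) * C(group) + age", data=df
).fit()

# Predicted cost for every meal x group cell, holding age at its sample mean.
# This shows the baseline cells too, whichever level was used as reference.
grid = pd.DataFrame(
    list(itertools.product(["Low", "Mid", "High"], ["A", "B"])),
    columns=["meal", "group"],
)
grid["age"] = df["age"].mean()
grid["predicted_cost"] = default_fit.predict(grid)

print(default_fit.params.index.tolist())
print(releveled_fit.params.index.tolist())
print(grid)
```

Both fits describe the same model, so their predictions and R-squared agree; only the labelling of the coefficients changes. The prediction grid is usually the more direct way to read off a specific cell such as Low within a given group.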