Consider the following example:
import random
import pandas as pd
import statsmodels.formula.api as smf

# Noisy quadratic data: y = x**2 plus Gaussian noise with mean 2 and standard deviation 1
df = pd.DataFrame({'y': [x**2 + random.gauss(2, 1) for x in range(10)],
                   'x': list(range(10))})
# Fit an unconstrained cubic polynomial by ordinary least squares
model = smf.ols(data=df, formula='y ~ x + I(x**2) + I(x**3)').fit()
df['pred'] = model.predict(df)
df.set_index('x').plot()
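As you can see, I fit a cubic model to my data, and the overall fit is very good. However, I would like to constrain my cubic model to take the following values at two specific x points:
f(0) = 10
f(8) = 60
How can I do this in statsmodels or sklearn?
Thanks!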
You can use fit_constrained with a GLM.
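import random
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame(
    {
        "y": [x ** 2 + random.gauss(2, 1) for x in range(10)],
        "x": list(range(10)),
    }
)
# GLM with the default Gaussian family, so the fit is a least-squares fit
untrained_glm = smf.glm("y ~ x + I(x ** 2) + I(x ** 3)", df)
# Constraints are passed as (R, q) with R @ params = q.
# Each row of R is [1, x, x**2, x**3] evaluated at a constrained x (here 0 and 8).
# The targets here are f(0) = 8 and f(8) = 60, which is what the output below shows;
# replace 8 with 10 to impose f(0) = 10 as stated in the question.
trained_glm = untrained_glm.fit_constrained(
    ([[1, 0, 0, 0], [1, 8, 64, 512]], [8, 60])
)
df["pred"] = trained_glm.predict(df)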
Result:
>>> df
           y  x       pred
0   0.191139  0   8.000000
1   3.225092  1   6.110541
2   5.353590  2   7.008272
3   9.367904  3  10.498092
4  16.512384  4  16.384900
5  28.742154  5  24.473595
6  36.584476  6  34.569078
7  51.006869  7  46.476246
8  66.839006  8  60.000000
9  82.163031  9  74.945239
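To see what the (R, q) constraint tuple encodes, here is a minimal NumPy sketch of equality-constrained least squares that solves the same kind of problem directly. It is not how statsmodels implements fit_constrained; the function name constrained_cubic_fit and its arguments are hypothetical and only serve the illustration. It uses the targets f(0) = 10 and f(8) = 60 from the question.

```python
import numpy as np


def constrained_cubic_fit(x, y, x_fixed, y_fixed):
    """Least-squares cubic fit that passes exactly through (x_fixed, y_fixed).

    Hypothetical helper for illustration; returns coefficients
    [b0, b1, b2, b3] of b0 + b1*x + b2*x**2 + b3*x**3.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns 1, x, x**2, x**3
    X = np.vander(x, 4, increasing=True)
    # Constraint rows: the same basis evaluated at the fixed x values
    R = np.vander(np.asarray(x_fixed, dtype=float), 4, increasing=True)
    q = np.asarray(y_fixed, dtype=float)
    n_c = R.shape[0]
    # KKT system for: minimize ||X b - y||^2 subject to R b = q
    kkt = np.block([[2 * X.T @ X, R.T], [R, np.zeros((n_c, n_c))]])
    rhs = np.concatenate([2 * X.T @ y, q])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:4]  # drop the Lagrange multipliers


rng = np.random.default_rng(0)
x = np.arange(10, dtype=float)
y = x ** 2 + rng.normal(2, 1, size=x.size)

beta = constrained_cubic_fit(x, y, x_fixed=[0, 8], y_fixed=[10, 60])
pred = np.vander(x, 4, increasing=True) @ beta
print(pred[0], pred[8])  # the two constrained points
```

Because the constraints enter as exact equations in the linear system, the fitted curve should hit both target values up to floating-point error, while the remaining freedom in the cubic is used to minimize the squared error on the other points.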