I need to use SciPy's curve_fit to predict/forecast/extrapolate values beyond 2001-01-15. How can I predict past 2001-01-15 and out to 2001-01-20?
import pandas as pd
import numpy as np
from datetime import timedelta
from scipy.optimize import curve_fit
def hyperbolic_equation(t, qi, b, di):
    return qi/((1.0+b*di*t)**(1.0/b))
df1 = pd.DataFrame({
    'date': ['2001-01-01','2001-01-02','2001-01-03', '2001-01-04', '2001-01-05',
             '2001-01-06','2001-01-07','2001-01-08', '2001-01-09', '2001-01-10',
             '2001-01-11','2001-01-12','2001-01-13', '2001-01-14', '2001-01-15'],
    'cumsum_days': [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],
    'prod': [800, 900, 1200, 700, 600,
             550, 500, 650, 625, 600,
             550, 525, 500, 400, 350]})
df1['date'] = pd.to_datetime(df1['date'])
qi = max(df1['prod'])
#Hyperbolic curve fit the data to get best fit equation
popt_hyp, pcov_hyp = curve_fit(hyperbolic_equation, df1['cumsum_days'], df1['prod'],bounds=(0, [qi,1,20]))
#Adding in predicted values back into df1
df1.loc[:,'Hyperbolic_Predicted'] = hyperbolic_equation(df1['cumsum_days'], *popt_hyp)
Here I create a DataFrame of future dates (the test set):
df1['future_date'] = df1['date']
ftr = (df1['future_date'] + pd.Timedelta(5, unit='days')).to_frame()
#Constructs empty columns for ftr dataframe
for col in df1.columns:
    if col not in ftr.columns:
        ftr[col] = None
#Subset future dataframe to predict on (test set)
ftr = ftr[(ftr['future_date'] > max(df1['date']))]
ftr['cumsum_days'] = [16,17,18,19,20]
This snippet concatenates the future dataset with the original dataset (if needed):
df1 = pd.concat([df1, ftr], ignore_index=True)
print(df1)
Hyperbolic_Predicted cumsum_days date future_date prod
0 931.054472 1 2001-01-01 2001-01-01 800
...
14 409.462743 15 2001-01-15 2001-01-15 350
15 NaN 16 NaT 2001-01-16 None
16 NaN 17 NaT 2001-01-17 None
17 NaN 18 NaT 2001-01-18 None
18 NaN 19 NaT 2001-01-19 None
19 NaN 20 NaT 2001-01-20 None
As soon as I rerun the curve_fit call on the combined DataFrame, I get an error. How can I predict past 2001-01-15 and out to 2001-01-20?
popt_hyp, pcov_hyp = curve_fit(hyperbolic_equation, df1['cumsum_days'], df1['prod'],bounds=(0, [qi,1,20]))
Error:
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
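SciPy's curve_fit does not provide a predict function for new data. Instead, it returns the fitted function coefficients and the covariance of those coefficients.
The fitted coefficients are in popt_hyp: [9.93612473e+02 2.28621390e-01 6.55150136e-02].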
The covariance of the coefficients (pcov_hyp) is:
[[2.67920219e+04 2.62422207e+02 9.08459603e+00]
[2.62422207e+02 4.31869797e+00 1.24995934e-01]
[9.08459603e+00 1.24995934e-01 3.90417402e-03]]
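As a small aside on reading that matrix: the square roots of its diagonal give one-standard-deviation errors for qi, b and di. The sketch below just copies the numbers quoted above into arrays (the name perr is illustrative, not something from the question) and prints each coefficient next to its error.

```python
import numpy as np

# Coefficients and covariance copied from the values quoted above.
popt_hyp = np.array([9.93612473e+02, 2.28621390e-01, 6.55150136e-02])
pcov_hyp = np.array([
    [2.67920219e+04, 2.62422207e+02, 9.08459603e+00],
    [2.62422207e+02, 4.31869797e+00, 1.24995934e-01],
    [9.08459603e+00, 1.24995934e-01, 3.90417402e-03],
])

# One-standard-deviation error of each coefficient:
# the square root of the matching diagonal entry of the covariance.
perr = np.sqrt(np.diag(pcov_hyp))

for name, value, err in zip(["qi", "b", "di"], popt_hyp, perr):
    print(f"{name}: {value:.4g} +/- {err:.4g}")
```

With only 15 noisy points these errors are large relative to the coefficients themselves (for b the error exceeds the estimate), so any extrapolation beyond the fitted range should be treated with caution.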
For your purposes, you need to rebuild the function using the returned popt_hyp. The function whose coefficients you are estimating is:
def hyperbolic_equation(t, qi, b, di):
    return qi/((1.0+b*di*t)**(1.0/b))
Here t is the value you pass in, so the function estimated by curve_fit is:
def fitted_hyperbolic_equation(t):
    return popt_hyp[0]/((1.0+popt_hyp[1]*popt_hyp[2]*t)**(1.0/popt_hyp[1]))
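Then use this function to predict on the new data, for example fitted_hyperbolic_equation(ftr['cumsum_days']). Do not rerun curve_fit on the concatenated DataFrame: the future rows have no prod values, so they cannot be used for fitting.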
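Putting the pieces together, here is a minimal end-to-end sketch. It assumes the future rows are stored with NaN in prod (rather than None) and uses a boolean mask, here called observed, to fit only on the rows that have data. The column names follow the question; the mask name and the single 20-row DataFrame layout are illustrative choices, not the only way to do this.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def hyperbolic_equation(t, qi, b, di):
    return qi / ((1.0 + b * di * t) ** (1.0 / b))

# 15 observed days followed by 5 future days with no production yet.
prod = [800, 900, 1200, 700, 600, 550, 500, 650, 625, 600,
        550, 525, 500, 400, 350] + [np.nan] * 5
df = pd.DataFrame({
    "date": pd.date_range("2001-01-01", periods=20, freq="D"),
    "cumsum_days": np.arange(1, 21),
    "prod": prod,
})

# Fit only on rows that actually have a production value.
observed = df["prod"].notna()
t_obs = df.loc[observed, "cumsum_days"].to_numpy(dtype=float)
y_obs = df.loc[observed, "prod"].to_numpy(dtype=float)
qi = y_obs.max()
popt_hyp, pcov_hyp = curve_fit(
    hyperbolic_equation, t_obs, y_obs, bounds=(0, [qi, 1, 20])
)

# Evaluate the fitted curve on every row, including the future days 16-20.
df["Hyperbolic_Predicted"] = hyperbolic_equation(
    df["cumsum_days"].to_numpy(dtype=float), *popt_hyp
)

print(df.tail(5))
```

The key point is that fitting and prediction are separate steps: curve_fit only ever sees the observed rows, and the extrapolation is just a call to hyperbolic_equation with the fitted coefficients.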