I have this imputed DataFrame:
Imputed_Df.head():
                     Atmospheric_Pressure  Global_Radiation  Net_Radiation  Precipitation  Relative_Humidity  Temperature  Wind_Direction  Wind_Speed
Time
2013-11-01 01:00:00               999.451            207.75          99.09       4.450000          39.958667    13.600000      117.231667    2.138500
2013-11-01 05:00:00               992.760            167.77          85.16       5.746667          56.107500    11.900000      244.410000    2.313000
2013-11-01 09:00:00               990.272            157.00          95.04       6.271000          37.113333    12.802083      297.131500    3.270350
2013-11-01 10:00:00               998.367            191.26          82.32       4.428000          37.946500    13.800000      143.103333    2.232500
What I want to do is smooth all the columns in place, rather than adding new smoothed columns to this DataFrame. Here is how I tried to do it:
import statsmodels
from statsmodels.tsa.api import ExponentialSmoothing, SimpleExpSmoothing, Holt

def Smoothing(Col):
    for Col in Imputed_Df.columns:
        fit = SimpleExpSmoothing(Imputed_Df[Col]).fit(smoothing_level=0.2, optimized=False)
        fcast = fit.predict(start=Imputed_Df.index.min(), end=Imputed_Df.index.max())
    return fcast

Imputed_Df[['col1', 'col2', 'col3','col4','col5' , 'col6' , 'col7' , 'col8']] = Imputed_Df.apply(Smoothing, axis=1)
But I got this error:
Columns must be same length as key
Any suggestions would be much appreciated, thank you.
Assuming your Smoothing function is correct, run
print(Imputed_Df.apply(Smoothing, axis=1))
and check the number of columns in the returned DataFrame. It should be 8, because you are assigning to Imputed_Df[['col1', 'col2', 'col3','col4','col5' , 'col6' , 'col7' , 'col8']].
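A minimal sketch of where this error comes from, using a small hypothetical frame `df` and a hypothetical right-hand side `rhs` (not the asker's data): pandas raises it when the number of column names on the left differs from the number of columns on the right. With `apply(..., axis=1)` the function is called once per row, so the result presumably ends up with one column per timestamp instead of 8.

```python
import pandas as pd

# Hypothetical two-column frame standing in for Imputed_Df
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Right-hand side with three columns, but only two keys on the left
rhs = pd.DataFrame({"x": [0.0] * 3, "y": [0.0] * 3, "z": [0.0] * 3})

try:
    df[["col1", "col2"]] = rhs
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Comparing the two counts before assigning avoids the surprise
print(len(rhs.columns), "columns on the right,", 2, "keys on the left")
print(error_message)
```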
If the output does not have 8 columns, try:
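import statsmodels
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

def Smoothing(Imputed_Df):
    # Work on a copy so the input frame is left untouched
    my_df = Imputed_Df.copy()
    for Col in Imputed_Df.columns:
        # Fit simple exponential smoothing to one column at a time
        fit = SimpleExpSmoothing(Imputed_Df[Col]).fit(smoothing_level=0.2, optimized=False)
        # In-sample predictions over the whole index replace the original column
        my_df[Col] = fit.predict(start=Imputed_Df.index.min(), end=Imputed_Df.index.max())
    return my_df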
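For intuition about what `smoothing_level=0.2` does, here is a rough sketch of the exponential smoothing recursion using only pandas. Note that this swaps in `DataFrame.ewm` rather than statsmodels, and `ewm_smooth` and `sample` are hypothetical names; the numbers are not expected to match `fit.predict` exactly, since statsmodels handles the initial level and the forecast timing in its own way.

```python
import pandas as pd

def ewm_smooth(frame, alpha=0.2):
    """Apply s_t = alpha * y_t + (1 - alpha) * s_{t-1} to every column.

    Hypothetical helper: uses pandas' ewm, not statsmodels, and starts
    the recursion at the first observation (adjust=False).
    """
    return frame.ewm(alpha=alpha, adjust=False).mean()

# Two columns from the question's sample rows
sample = pd.DataFrame(
    {
        "Temperature": [13.6, 11.9, 12.802083, 13.8],
        "Wind_Speed": [2.1385, 2.313, 3.27035, 2.2325],
    },
    index=pd.to_datetime(
        [
            "2013-11-01 01:00:00",
            "2013-11-01 05:00:00",
            "2013-11-01 09:00:00",
            "2013-11-01 10:00:00",
        ]
    ),
)

smoothed = ewm_smooth(sample)
print(smoothed)
```

The result has the same index and columns as the input, which is the property the assignment in the question relies on.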
Actually, I found that the time series data is irregular, so resample it to a regular hourly grid first:
Imputed_Df = Imputed_Df.resample('H').ffill()  # regular hourly grid; ffill() forward-fills like the older pad()
Imputed_Df[['col1', 'col2', 'col3','col4','col5' , 'col6' , 'col7' , 'col8']] = Smoothing(Imputed_Df)
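The effect of the resampling step can be sketched on the four timestamps from the question. This is only an illustration with one column and the hypothetical names `irregular` and `regular`; the hourly rule is passed as a `pd.Timedelta` here to avoid depending on a particular frequency alias.

```python
import pandas as pd

# Irregular timestamps taken from the question's sample (one column kept)
idx = pd.to_datetime(
    [
        "2013-11-01 01:00:00",
        "2013-11-01 05:00:00",
        "2013-11-01 09:00:00",
        "2013-11-01 10:00:00",
    ]
)
irregular = pd.DataFrame({"Temperature": [13.6, 11.9, 12.802083, 13.8]}, index=idx)

# Put the data on a regular hourly grid; each gap takes the last known value
regular = irregular.resample(pd.Timedelta(hours=1)).ffill()

print(len(irregular), "rows before,", len(regular), "rows after")
print(regular)
```

Forward-filling repeats the last observation across each gap, so long gaps become flat stretches in the smoothed output; whether that is acceptable depends on the data.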
I would prefer to write it like this:
Imputed_Df[Imputed_Df.columns + "_SES"]= Smoothing(Imputed_Df)
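An equivalent way to attach the suffixed columns is sketched below with `add_suffix` and `join`. The frame `df` is hypothetical, and `smoothed` is only a stand-in for the output of `Smoothing(Imputed_Df)` (computed here with pandas' `ewm` so the snippet runs without statsmodels).

```python
import pandas as pd

# Hypothetical frame standing in for Imputed_Df
df = pd.DataFrame(
    {"Temperature": [13.6, 11.9, 12.8], "Wind_Speed": [2.1, 2.3, 3.3]}
)

# Stand-in for Smoothing(df): any frame with the same index and columns works
smoothed = df.ewm(alpha=0.2, adjust=False).mean()

# Rename every smoothed column with a "_SES" suffix and attach it to the original
combined = df.join(smoothed.add_suffix("_SES"))

print(list(combined.columns))
```

If the goal is to overwrite the original columns rather than add new ones, as the question describes, assigning the smoothed frame back (for example `df = smoothed`) would be enough.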