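I have a dataset containing visitor counts and weather variables, and I am trying to forecast visitors from the weather variables. Because the dataset only contains visitors during the season, every year has missing values and gaps. Running proc reg in SAS works fine, but proc VARMAX fails: because of the missing values, I cannot run the regression. How can I solve this?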
proc varmax data=tivoli4 printall plots=forecast(all);
id obs interval=day;
model lvisitors = rain sunshine averagetemp
dfebruary dmarch dmay djune djuly daugust doctober dnovember ddecember
dwednesday dthursday dfriday dsaturday dsunday
d_24Dec2016 d_05Dec2013 d_24Dec2017 d_24Dec2014 d_24Dec2015 d_24Dec2019
d_24Dec2018 d_24Sep2012 d_06Jul2015
d_08feb2019 d_16oct2014 d_15oct2019 d_20oct2016 d_15oct2015 d_22sep2017 d_08jul2015
d_20Sep2019 d_08jul2016 d_16oct2013 d_01aug2012 d_18oct2012 d_23dec2012 d_30nov2013 d_20sep2014 d_17oct2012 d_17jun2014
dFrock2012 dFrock2013 dFrock2014 dFrock2015 dFrock2016 dFrock2017 dFrock2018 dFrock2019
dYear2015 dYear2016 dYear2017
/p=7 q=2 Method=ml dftest;
garch p=1 q=1 form=ccc OUTHT=CONDITIONAL;
restrict
ar(3,1,1)=0, ar(4,1,1)=0, ar(5,1,1)=0,
XL(0,1,13)=0, XL(0,1,14)=0, XL(0,1,13)=0, XL(0,1,27)=0, XL(0,1,38)=0, XL(0,1,42)=0;
output lead=10 out=forecast;
run;
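As with any forecast, you first need to prepare your time series. Start by running your data through PROC TIMESERIES to fill in or impute missing values. The most appropriate imputation depends on the variable. The code below will:
- Sum lvisitors and set its missing values to 0
- Set missing values of averagetemp to the average
- Set missing values of rain, sunshine, and the variables starting with d to 0 (assuming they are indicators)
Code: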
proc timeseries data=have out=want;
    id date interval = day
            setmissing = 0
            notsorted
    ;
    var lvisitors / accumulate=total;
    crossvar averagetemp / accumulate=none setmissing=average;
    crossvar rain sunshine d: / accumulate=none;
run;
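Once the series is prepared, it can be checked for remaining gaps and passed to the model. The following is only a sketch: it assumes the output data set `want` from the step above and an ID variable named `date` (the question uses `obs`), and it keeps just the weather regressors from the question's model for brevity.

```sas
/* Count any missing values left in the prepared series */
proc means data=want n nmiss;
    var lvisitors averagetemp rain sunshine;
run;

/* Reduced version of the question's model, fitted on the prepared data.
   ASSUMPTION: the ID variable is `date`, as in the PROC TIMESERIES step. */
proc varmax data=want;
    id date interval=day;
    model lvisitors = rain sunshine averagetemp / p=7 q=2 method=ml;
run;
```

If the NMISS column is nonzero for any variable, that variable still needs an imputation rule before the model is fitted.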
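Important time interval consideration
Depending on your data, this may bias your error rates and estimates, because you always know that nobody will be around during the off-season. If the off-season accounts for many of the missing values, use a custom time interval so that off-season values are not imputed.
In the example below, a custom time interval named q2q4month is created that excludes all months in Q1. This prevents PROC TIMESERIES from setting air to 0 for any month in Q1.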
/* SAMPLE DATA: Remove all dates in Q1 */
data air;
    set sashelp.air;
    where month(date) NOT IN(1,2,3);
run;

/* Create a custom month interval that excludes Q1 */
data q2q4month;
    format begin monyy.;
    do year = 1940 to year(today());
        do mon = 4 to 12;
            begin = mdy(mon, 1, year);
            season = month(begin);
            output;
        end;
    end;
    keep begin season;
run;

/* Create the custom interval from the q2q4month dataset */
options intervalds=(q2q4month = work.q2q4month);

/* Accumulate to the custom interval */
proc timeseries data=air out=want;
    id date interval = q2q4month
            setmissing = 0
            notsorted
    ;
    var air / accumulate=total;
run;
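The same idea can be carried over to the daily visitor data. The sketch below is hypothetical in several places: the season months in `season_months`, the date range of the loop, and the interval name `seasonday` are all assumptions to be replaced with the real opening calendar. It also assumes the input data set `have` has a date variable named `date` and contains no off-season rows, since a date that falls between two interval starts would be accumulated into the preceding season day.

```sas
/* ASSUMPTION: months in which there are visitors; replace with the real season */
%let season_months = 4,5,6,7,8,9;

/* Custom daily interval containing only in-season days.
   ASSUMPTION: the range below covers every date in the data. */
data work.seasonday;
    format begin date9.;
    do begin = '01JAN2012'd to '31DEC2020'd;
        if month(begin) in (&season_months) then output;
    end;
    keep begin;
run;

/* Register the custom interval under the hypothetical name seasonday */
options intervalds=(seasonday = work.seasonday);

/* Impute only in-season gaps; off-season days are not part of the interval */
proc timeseries data=have out=want_season;
    id date interval = seasonday
            setmissing = 0
            notsorted
    ;
    var lvisitors / accumulate=total;
    crossvar averagetemp / accumulate=none setmissing=average;
    crossvar rain sunshine d: / accumulate=none;
run;
```

With this setup, the last open day of one season is immediately followed by the first open day of the next, so lagged terms in a later model span the off-season gap. Whether that is acceptable depends on how much the start of a season resembles the end of the previous one.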