program PseudoRealTimeForecast
    syntax varlist , date(varname) freq(name) target(varname) model(name) legend(name) ///
        [factors(integer 2) shocks(integer 1) varorder(integer 1) forecast(integer 1) ///
        copy delete verbose bridgevars(varlist) training(integer 60) easy ///
        without(varlist) with(varlist)]
    qui {
        /* Notes */
        * This program only applies if the data is monthly.
        * But the target variable can be either monthly or quarterly.

        /* options */
        * date: date variable
        * freq: frequency of target variable (m or q for now)
        * target: target variable
        * model: name to prefix variables (used when copy is specified)
        * factors: number of factors
        * shocks: number of common shocks to the factors
        * varorder: VAR order, number of lags
        * legend: name of legend file, must be in the current working directory
        * forecast: number of forecasts to make after the end of the current data set (used to tsappend)
        * copy: whether to copy the specified variables and prefix the names with the model name, or to replace all variables involved
        * delete: whether unused variables are to be deleted; usually combined with copy to generate a data set containing only the variables involved
        * verbose: whether to display additional log information
        * bridgevars: variables to be put in the bridge equation. If specified, these variables are not used in factor extraction.
        * training: number of months used as training sample (always specified in months), default 60 months (5 years).
        * easy: do the easy exercise (always cut the data without recreating the jagged edge); otherwise, do the better approximation and create the jagged edge based on the publication lag information available from the legend.
        * (specific to this study, not needed in general models) without: variables whose current observations are forcibly NOT added (e.g., the variable is treated as m-1)
        * (specific to this study, not needed in general models) with: variables whose current observations are forcibly added (e.g., the variable is treated as m)

        /* example command */
        * PseudoRealTimeForecast x1 x2 x3 x4, date(month) freq(M) target(rgdp) model(M1) factors(3) shocks(2) varorder(4) legend(legend) forecast(12) copy delete

        /* Basic Idea */
        * The basic idea of the program is to back up the original data and use the backup data to add values as time progresses.
        * The time index can be manipulated to limit the number of observations used, since an if or in condition is not easily implemented in Nowcast.
        * The overall flow is
        * 1. Keep only variables that are used.
        * 2. Generate backup copies of these variables.
        * 3. Cut the sample according to the "current" period.
        * 4. Process the sample and prepare to send to FactorExtraction.
        * 5. Send sample to FactorExtraction
        * 6. Use the factors to make forecasts (timing of bridge equations may need to be changed)
        * 7. Record forecasts in backup forecast variables
        * 8. Go back to Step 3 for the next period.

        /* Check bridge equation specification */
        if "`bridgevars'"!="" {
            noi LogShow Variables to be put in bridge equation in addition to factors: `bridgevars'
            noi di as error "Out-of-sample forecasting results in change of timing in bridge equations according to forecast horizon!"
        }

        /* set verbose global */
        global verbose `verbose'
        noi LogShow Check user specified variables and options

        /* Check frequency */
        if lower("`freq'")=="m" {
            noi LogShow Target variable is monthly
        }
        else if lower("`freq'")=="q" {
            noi LogShow Target variable is quarterly
        }
        else {
            noi di as error "Unable to recognize specified frequency `freq'"
            exit 3000
        }

        /* Keep only variables that are used */
        if "`delete'"=="delete" {
            noi LogShow Delete unused variables
            keep `date' `target' `varlist' `bridgevars'
        }

        /* Add observations to hold forecasts */
        su `date'
        local allmax = r(max)
        su `date' if `target'!=.
        local targetmax = r(max)
        if lower("`freq'")=="q" {
            local toadd = `forecast'*3 - (`allmax'-`targetmax')
        }
        else {
            local toadd = `forecast' - (`allmax'-`targetmax')
        }
        if `toadd'>0 {
            noi LogShow Add observations to hold forecasts
            tsappend, add(`toadd')
        }
        tsset `date'
        sort `date'

        /* Make backup copies of variables */
        noi LogShow Back up all remaining variables
        foreach var of varlist _all {
            cap drop `var'_backup
            gen `var'_backup = `var'
        }

        /* Read legend file for publication lag */
        * preserve current data
        noi LogShow Read legend file for publication lag and transformation code
        preserve
        * read legend file
        use `legend'.dta, clear
        * count number of entries
        count
        local max=r(N)
        * read the transformation code and publication lag for each entry
        forvalues obs=1/`max' {
            local name = varname[`obs']
            local transcode = transcode[`obs']
            local publag = publag[`obs']
            if "`copy'"=="copy" {
                local `name'_`model'=`transcode'
                local `name'P_`model'="`publag'"
            }
            else {
                local `name'=`transcode'
                local `name'P="`publag'"
            }
        }
        * get back to current data
        restore

        /* Start pseudo real time exercise */
        * get the date for the end of the sample
        su `date'
        local enddate = r(max)
        local startdate = r(min)
        * set start date to the end of the training sample
        local startdate = `startdate' + `training'
        * loop over all dates from the end of the training sample to the end of the sample
        noi LogShow Start the pseudo real time exercise looping over all periods
        * initialize progress bar
        local totaldate = `enddate'-`startdate'+1
        * initialize variables holding forecasts
        * (freq is compared case-insensitively, consistent with the frequency check above)
        if lower("`freq'")=="m" {
            * if monthly, only need as many variables as horizons
            forvalues h=1/`forecast' {
                cap drop `target'_pred`h'
                gen `target'_pred`h'=.
            }
        }
        else if lower("`freq'")=="q" {
            * for quarterly, need three times as many, as there are three months in a quarter
            forvalues h=1/`forecast' {
                cap drop `target'_m1pred`h'
                gen `target'_m1pred`h'=.
            }
            forvalues h=1/`forecast' {
                cap drop `target'_m2pred`h'
                gen `target'_m2pred`h'=.
            }
            forvalues h=1/`forecast' {
                cap drop `target'_m3pred`h'
                gen `target'_m3pred`h'=.
            }
        }

        forvalues currentdate = `startdate'/`enddate' {
            * progress bar
            if mod((`currentdate'-`startdate'+1), `totaldate'/10)<1 {
                noi di %2.0f (`currentdate'-`startdate'+1)/`totaldate'*100 "%" _continue
            }
            else if mod((`currentdate'-`startdate'+1), `totaldate'/50)<1 {
                noi di "." _continue
            }

            * prepare the edge of the data
            * restore from backup
            foreach var of varlist `varlist' `target' `date' `bridgevars' {
                replace `var' = `var'_backup
            }
            * most variables come with one period lag
            local cutdate = `currentdate' - 1
            foreach var of varlist `date' `target' `varlist' `bridgevars' {
                * results in balanced data up to and including cutdate
                replace `var'=. if `date'_backup>`cutdate'
            }

            * if easy exercise, that's all; otherwise, create jagged edge
            * publication lag is the last month with available data if we are at the end of month m.
            if "`easy'"=="" {
                foreach var of varlist `target' `varlist' `bridgevars' {
                    * process variables according to publag
                    if "``var'P'"=="m" {
                        replace `var'=`var'_backup if `date'_backup==`currentdate'
                    }
                    else if "``var'P'"=="m-1" {
                        * do nothing if lag is m-1
                    }
                    else if "``var'P'"=="m-2" {
                        replace `var'=. if `date'_backup==`currentdate'-1
                    }
                    else if "``var'P'"=="m-3" {
                        replace `var'=. if (`date'_backup==`currentdate'-1)|(`date'_backup==`currentdate'-2)
                    }
                    else if "``var'P'"=="q-1" {
                        tempvar quarter2
                        cap drop `quarter2'
                        gen `quarter2' = qofd(dofm(`date'_backup))
                        local currentquarter = qofd(dofm(`currentdate'))
                        replace `var'=. if `quarter2'>=`currentquarter'
                    }
                    else {
                        noi di as error "Unable to recognize publication lag ``var'P' for variable `var'"
                        exit 3000
                    }
                    replace `var'=. if `date'_backup>`cutdate'+1
                }
            }

            * process "with" and "without" variables
            if "`without'"!="" {
                foreach var of varlist `without' {
                    * make sure the variable is made UNAVAILABLE
                    replace `var'=. if `date'_backup==`currentdate'
                }
            }
            if "`with'"!="" {
                foreach var of varlist `with' {
                    * make sure the variable is made AVAILABLE
                    replace `var'=`var'_backup if `date'_backup==`currentdate'|`date'_backup==`currentdate'-1|`date'_backup==`currentdate'-2
                }
            }

            * transform independent variables
            * to avoid random sorting of observations with missing dates, use backup dates temporarily while time series operators are needed.
            tsset `date'_backup
            sort `date'_backup
            foreach var of varlist `varlist' `target' {
                if "``var''"=="1" {
                    * monthly growth rate (percent)
                    cap drop `var'_tmp
                    gen `var'_tmp = ((`var'-l.`var')/l.`var')*100 in 13/l
                    replace `var' = `var'_tmp
                    cap drop `var'_tmp
                }
                else if "``var''"=="2" {
                    * monthly differences
                    cap drop `var'_tmp
                    gen `var'_tmp = d.`var' in 13/l
                    replace `var' = `var'_tmp
                    cap drop `var'_tmp
                }
                else if "``var''"=="3" {
                    * yearly (12-month) growth rate (percent)
                    cap drop `var'_tmp
                    gen `var'_tmp = ((`var'-l12.`var')/l12.`var')*100 in 13/l
                    replace `var' = `var'_tmp
                    cap drop `var'_tmp
                }
                else if "``var''"=="0" {
                    * no transform
                }
                else {
                    noi di as error "Unable to recognize transformation code ``var'' of variable `var'"
                    exit 3000
                }
            }
            * set initial 12 observations to missing due to transformation
            foreach var of varlist `date' `target' `varlist' `bridgevars' {
                replace `var'=. in 1/12
            }

            * Transform monthly differences (growth rates) into quarterly equivalents
            if lower("`freq'")=="q" {
                foreach var of varlist `varlist' {
                    cap drop `var'_tmp
                    gen `var'_tmp = `var'+2*l.`var' if `var'!=.&(`var'+2*l.`var')!=.&_n-12==2
                    replace `var'_tmp = `var'+2*l.`var'+3*l2.`var' if `var'!=.&(`var'+2*l.`var'+3*l2.`var')!=.&_n-12==3
                    replace `var'_tmp = `var'+2*l.`var'+3*l2.`var'+2*l3.`var' if `var'!=.&(`var'+2*l.`var'+3*l2.`var'+2*l3.`var')!=.&_n-12==4
                    replace `var'_tmp = `var'+2*l.`var'+3*l2.`var'+2*l3.`var'+l4.`var' if `var'!=.&(`var'+2*l.`var'+3*l2.`var'+2*l3.`var'+l4.`var')!=.&_n-12>5
                    replace `var' = `var'_tmp if `var'!=.&`var'_tmp!=.
                    cap drop `var'_tmp
                }
            }
            * done with the temporary use of backup dates
            tsset `date'
            sort `date'_backup

            * Make sure the data start from the first month of a quarter
            if !(month(dofm(`date'[13]))==1|month(dofm(`date'[13]))==4|month(dofm(`date'[13]))==7|month(dofm(`date'[13]))==10) {
                foreach var of varlist `date' `target' `varlist' `bridgevars' {
                    replace `var'=. in 13
                }
                if !(month(dofm(`date'[14]))==1|month(dofm(`date'[14]))==4|month(dofm(`date'[14]))==7|month(dofm(`date'[14]))==10) {
                    foreach var of varlist `date' `target' `varlist' `bridgevars' {
                        replace `var'=. in 14
                    }
                }
            }

            * Require every series to have at most 1/3 missing data; abort otherwise
            count if `date'!=.
            local total=r(N)
            foreach var of varlist `varlist' `bridgevars' {
                count if `var'==.&`date'!=.
                local miss = r(N)
                if `miss'/`total'>1/3 {
                    local missvar "`missvar' `var'"
                }
            }
            if length("`missvar'")!=0 {
                noi di as error "There are variables with more than 1/3 of missing data:"
                noi di as error "`missvar'"
                exit 3000
            }

            * Outliers Correction
            su `date' if `target'!=.&`date'!=.
            local MaxDateToCorrect=r(max)-12
            local StartDateToCorrect=r(min)
            OutliersCorrection `varlist', date(`date') end(`MaxDateToCorrect') start(`StartDateToCorrect')

            * Restore dates to observations to hold forecasts
            su `date' if `target'!=.
            local targetmax = r(max)
            if lower("`freq'")=="q" {
                local addto = `forecast'*3 + `targetmax'
            }
            else {
                local addto = `forecast' + `targetmax'
            }
            replace `date'=`date'_backup if `date'_backup>`targetmax'&`date'_backup<=`addto'

            * Set up new date variable for regression
            if lower("`freq'")=="q" {
                tempvar quarter
                cap drop `quarter'
                gen `quarter' = qofd(dofm(`date')) if (month(dofm(`date'))==3|month(dofm(`date'))==6|month(dofm(`date'))==9|month(dofm(`date'))==12)&`date'!=.
                local date2 "`quarter'"
                * define targetmax2 using new date variable
                su `date2' if `target'!=.
                local targetmax2 = r(max)
            }
            else {
                local date2 "`date'"
            }

            * Factor Extraction
            * FactorExtraction `varlist', q(`shocks') r(`factors') p(`varorder') name(EstFct) date(`date')

            * Forecast and store results
            cap drop `target'_pred
            * forecast for all horizons by changing the timing of the bridge equation
            forvalues h = 1/`forecast' {
                * construct forward bridgevars list
                * if the bridge variables are "with" variables, lead them by one period before lagging
                if "`with'"!=""&"`with'"=="`bridgevars'" {
                    if "`bridgevars'"!="" {
                        local fbridgevars " "
                        tempvar csortorder
                        cap drop `csortorder'
                        gen `csortorder'=_n
                        sort `date'_backup
                        foreach var of varlist `bridgevars' {
                            tempvar f`var'
                            cap drop `f`var''
                            gen `f`var'' = `var'[_n+1]
                            local fbridgevars "`fbridgevars' l`h'.`f`var''"
                        }
                        sort `csortorder'
                        drop `csortorder'
                    }
                }
                if "`without'"!=""&"`without'"=="`bridgevars'" {
                    * add lags to variable list
                    local fbridgevars " "
                    foreach var of varlist `bridgevars' {
                        local fbridgevars "`fbridgevars' l`h'.`var'"
                    }
                }
                * regress with lag
                tsset `date2'
                sort `date2'
                reg `target' EstFct* `fbridgevars' if `date2'!=.
                tempvar pred
                cap drop `pred'
                predict `pred', xb
                * record result
                if lower("`freq'")=="m" {
                    * when regression is monthly
                    replace `target'_pred`h' = `pred' if `date'==`targetmax'+`h'
                }
                else {
                    * when regression is quarterly
                    if month(dofm(`currentdate'))==1|month(dofm(`currentdate'))==4|month(dofm(`currentdate'))==7|month(dofm(`currentdate'))==10 {
                        replace `target'_m1pred`h' = `pred' if qofd(dofm(`date'))==`targetmax2'+`h'&(month(dofm(`date'))==3|month(dofm(`date'))==6|month(dofm(`date'))==9|month(dofm(`date'))==12)
                    }
                    else if month(dofm(`currentdate'))==2|month(dofm(`currentdate'))==5|month(dofm(`currentdate'))==8|month(dofm(`currentdate'))==11 {
                        replace `target'_m2pred`h' = `pred' if qofd(dofm(`date'))==`targetmax2'+`h'&(month(dofm(`date'))==3|month(dofm(`date'))==6|month(dofm(`date'))==9|month(dofm(`date'))==12)
                    }
                    else if month(dofm(`currentdate'))==3|month(dofm(`currentdate'))==6|month(dofm(`currentdate'))==9|month(dofm(`currentdate'))==12 {
                        replace `target'_m3pred`h' = `pred' if qofd(dofm(`date'))==`targetmax2'+`h'&(month(dofm(`date'))==3|month(dofm(`date'))==6|month(dofm(`date'))==9|month(dofm(`date'))==12)
                    }
                }
            }
            * sort back
            tsset `date'
            sort `date'_backup
        * end looping over all dates
        }
        * end progress bar
        noi di " "
        noi LogShow End looping over all periods

        * restore data from backup
        noi LogShow Restore data from backup
        foreach var of varlist `varlist' `target' `date' `bridgevars' {
            replace `var' = `var'_backup
        }
        * remove backup variables
        drop *_backup
    * end quietly
    }
end

program LogShow
    if "$verbose"=="verbose" {
        noisily display as text "[" c(current_time) "] " "`0'"
    }
end
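
* ---------------------------------------------------------------------------
* Illustrative sketch, not part of the original program: one possible way to
* build the legend file that PseudoRealTimeForecast reads. The program above
* looks up three variables in the legend: varname, transcode (0, 1, 2 or 3)
* and publag (m, m-1, m-2, m-3 or q-1). Because the target is also
* transformed and lagged according to the legend, it needs an entry too.
* ASSUMPTIONS: the helper name MakeExampleLegend, the series names x1-x4 and
* rgdp, and the codes assigned to each series are hypothetical.
program MakeExampleLegend
    * keep the data currently in memory untouched
    preserve
    clear
    set obs 5
    gen str32 varname = ""
    gen byte transcode = .
    gen str3 publag = ""
    local i = 0
    * each entry is "name transcode publag" (hypothetical values)
    foreach entry in "x1 1 m" "x2 2 m-1" "x3 3 m-2" "x4 0 m-3" "rgdp 1 q-1" {
        local ++i
        tokenize `entry'
        replace varname = "`1'" in `i'
        replace transcode = `2' in `i'
        replace publag = "`3'" in `i'
    }
    * written to the current working directory, where legend() expects it
    save legend, replace
    restore
end

* Possible usage, assuming a monthly data set with the variables month,
* x1-x4 and rgdp is in memory and has been tsset on month:
*   MakeExampleLegend
*   PseudoRealTimeForecast x1 x2 x3 x4, date(month) freq(q) target(rgdp) model(M1) legend(legend) factors(3) shocks(2) varorder(4) forecast(2) verbose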