10.8 USING PRIOR INFORMATION

Prior information can be very useful in making forecasts. Fildes and Stevens (1978) show that prior information can be important where there is little or no data. Section 10.8 of this paper examines the effects of feeding prior information into the estimation process, and tests the sensitivity of the results by feeding in different priors.

The Kalman filter incorporates prior information into the estimation process very easily. In all the estimations done so far, the initial values of the mean and variance of the parameters were set to 0 and 100; this corresponds to great prior uncertainty. In this part of the paper, the prior mean and variance for the income, wool price and synthetic fibre price elasticities will be set to values that represent the introduction of prior information. In all of the runs done in part 10.8, V was set to the values for final and preliminary data.

10.8.1 THE PRIOR INFORMATION, CASE 1

A reasonable a priori value for the income (i.e. consumer expenditure) elasticity would be 1.0: the average expenditure elasticity (averaged over all consumer goods and services, weighted by expenditure share) must be 1.0, and there is no strong prior reason why wool or total fibres should be very different from this average. In addition, various work done in the area of fibre demand (reviewed in Newland, 1979) points to an elasticity of around 1.

A reasonable a priori value for the wool price elasticity is -0.5; various other values (between, say, -0.1 and -1.0) might also be acceptable. It also seems plausible that the cross elasticity with respect to synthetic fibre price is equal in magnitude (but opposite in sign) to the wool price elasticity, i.e. 0.5. Thus the prior means for the three elasticities will be set to 1.0, -0.5 and 0.5.

The strength of this prior information is specified by the variance of the prior estimates. It is easier, however, to think in terms of the standard deviation.
A standard deviation of 0.2 is large enough to give the Kalman filter sufficient leeway to modify the parameter estimates, yet not so large that the prior information will immediately be swamped by the data. This gives a prior variance of 0.04. In a real life situation, a consideration of the source and quality of the prior information would give a figure for the prior variance.

10.8.1.1 PRIOR DATA CASE 1, W = 10^-3

The runs of 10.5.2 were repeated with the prior information of 10.8.1. The results are given below.

Table 60

          C             D             E            DSSE
Belgium   1.041 (.24)   -.247 (.12)   .507 (.21)   .0188
France    1.057 (.23)   -.307 (.12)   .469 (.20)   .0088
Germany   1.029 (.23)   -.292 (.12)   .421 (.20)   .0133
Holland   1.058 (.23)   -.228 (.13)   .343 (.21)   .0750
Italy      .824 (.16)   -.419 (.18)   .465 (.22)   .0693
Japan     1.040 (.17)   -.395 (.18)   .449 (.22)   .1143
UK        1.056 (.21)   -.236 (.08)   .408 (.15)   .0662
USA       1.023 (.22)   -.543 (.10)   .637 (.15)   .1991
AVERAGE   1.016         -.333         .462         .0706

(C, D and E are the income, wool price and synthetic fibre price elasticities; standard errors are in parentheses.)

The DSSE is 9% worse than the DSSE of 10.5.2, and the parameter estimates are very different. Where the standard errors of the estimates of 10.5.2 are small (e.g. the UK wool price elasticity), the estimates of 10.8.1.1 are very close to those of 10.5.2. This is because a small standard error implies a lot of relevant information in the data, and so the prior is largely ignored. In the case of the UK wool price elasticity, the 10.5.2 estimate is -0.206 and the 10.8.1.1 estimate is -0.236. Where the standard errors of the estimates of 10.5.2 are large (for example, the UK income elasticity), the estimates of 10.8.1.1 are close to the prior values; here the data carry little relevant information, and so the prior dominates.

10.8.1.2 PRIOR DATA CASE 1, W = 10^-2

The estimations of 10.8.1.1 were repeated, but with W = 10^-2. This gives lower weight to older information, and so increases the speed at which the prior information becomes irrelevant.
In 10.5.1 above, W = 10^-2 was tried without prior information (great prior uncertainty). The results included a lot of wrongly signed parameters.

Table 61

          C             D             E            DSSE
Belgium   1.103 (.34)   -.324 (.26)   .440 (.42)   .0219
France    1.078 (.28)   -.399 (.27)   .423 (.41)   .0098
Germany   1.031 (.27)   -.331 (.27)   .440 (.42)   .0200
Holland   1.091 (.27)   -.265 (.27)   .344 (.42)   .0858
Italy      .969 (.39)   -.483 (.34)   .468 (.39)   .0789
Japan     1.232 (.44)   -.486 (.34)   .449 (.39)   .1352
UK        1.053 (.24)   -.226 (.22)   .474 (.38)   .0584
USA       1.039 (.25)   -.497 (.23)   .546 (.39)   .1578
AVERAGE   1.074         -.376         .448         .0710

Many of the countries' DSSEs have got worse, but the USA's DSSE is much better (.1578 compared with .1991). Although the USA elasticities are very close to the prior elasticities, this is not because the prior information is too strong. For example, the estimate of the synthetic fibre price elasticity rises to 0.786 in 1974, before falling back to 0.546. So the improvement in the USA must come from the following effect. In the 1960s, wool prices, income and synthetic fibre prices were all highly collinear (see Solomon, 1980), so the elasticity estimates are imprecise. This does not affect forecasting, however, as long as these variables remain collinear. In the 1970s, this collinearity ended as wool prices rose very fast indeed, then fell. If the parameter estimates are imprecise just before then, the forecast will be highly inaccurate and make a large contribution to DSSE. If, however, some prior information has broken the multicollinearity (see Solomon, 1980), then the forecasts of the early 1960s are barely affected, and the forecasts of the 1970s are much improved. This is what has happened in this case.

10.8.1.3 PRIOR DATA CASE 1, W = 10^-4

The estimations were again repeated, but with W = 10^-4, which gives more weight to older information.

Table 62

          C             D             E            DSSE
Belgium    .879 (.16)   -.247 (.07)   .556 (.12)   .0238
France     .915 (.18)   -.256 (.05)   .512 (.11)   .0148
Germany   1.000 (.19)   -.295 (.07)   .332 (.11)   .0069
Holland    .878 (.19)   -.211 (.08)   .364 (.12)   .0650
Italy      .745 (.08)   -.303 (.10)   .398 (.15)   .0539
Japan      .818 (.07)   -.191 (.10)   .317 (.15)   .0739
UK        1.049 (.20)   -.317 (.03)   .322 (.07)   .0854
USA       1.047 (.20)   -.721 (.06)   .840 (.07)   .3550
AVERAGE    .916         -.318         .455         .0848

The average DSSE is rather worse than in 10.8.1.1 (.0848 compared with .0706); this overall view conceals the fact that there is a large deterioration in the USA DSSE (.3550 compared with .1991).

10.8.1.4 PRIOR DATA CASE 1, W = 0.0

Just to see what would happen, a run was done for the prior information of 10.8.1 with W = 0.

Table 63

          C             D             E            DSSE
Belgium    .787 (.09)   -.269 (.06)   .593 (.07)   .0261
France     .673 (.04)   -.248 (.03)   .716 (.05)   .0200
Germany    .867 (.07)   -.321 (.05)   .297 (.05)   .0056
Holland    .351 (.07)   -.214 (.06)   .436 (.06)   .0840
Italy      .670 (.06)   -.367 (.04)   .432 (.06)   .0787
Japan      .641 (.03)   -.253 (.02)   .262 (.03)   .1776
UK         .163 (.07)   -.425 (.02)   .300 (.04)   .1950
USA        .817 (.08)   -.905 (.04)   .955 (.04)   .7278
AVERAGE    .621         -.375         .499         .1644

The average DSSE is very much worse than when W was 10^-2, 10^-3 or 10^-4. This is mostly because of the large increase in the DSSE for the USA. This confirms the hypothesis of a parameter shift in the USA: with a zero W, a shift in the parameters cannot be accommodated as well as with a non-zero W.

10.8.2 THE PRIOR INFORMATION, CASE 2

The prior information used in this part of the paper is the same as in 10.8.1, except that the standard deviation is doubled to 0.4 (the variance is quadrupled to 0.16). Thus it will be possible to see what happens when less precise prior information is used.
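
The effect of doubling the prior standard deviation can be illustrated with a minimal sketch. It assumes a single parameter and a static, precision-weighted (Gaussian) combination of the prior with a data-based estimate, which is a simplification of what the Kalman filter does over many periods. The function name combine and the data estimate of -0.20 with standard error 0.10 are hypothetical and are not taken from the runs in this paper; only the prior mean of -0.5 and the prior standard deviations of 0.2 and 0.4 come from the text.

```python
def combine(prior_mean, prior_sd, data_est, data_se):
    """Precision-weighted (Gaussian) combination of a prior and a data estimate.

    Returns the combined mean and its standard deviation.
    """
    prior_prec = 1.0 / prior_sd ** 2      # precision = 1 / variance
    data_prec = 1.0 / data_se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * data_est)
    return post_mean, post_var ** 0.5


# Hypothetical data-only estimate of a price elasticity and its standard error.
data_est, data_se = -0.20, 0.10

case1 = combine(-0.5, 0.2, data_est, data_se)  # prior variance 0.04
case2 = combine(-0.5, 0.4, data_est, data_se)  # prior variance 0.16

print("prior sd 0.2:", case1)  # mean about -0.26
print("prior sd 0.4:", case2)  # mean about -0.22
```

Under these assumed figures the weaker prior leaves the combined estimate closer to the data-based value, and the combined standard deviation is larger.
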

10.8.2.1 PRIOR DATA CASE 2, W = 10^-3

Table 64

          C             D             E            DSSE
Belgium    .981 (.38)   -.188 (.13)   .504 (.26)   .0212
France    1.009 (.41)   -.252 (.13)   .466 (.26)   .0132
Germany   1.023 (.41)   -.253 (.13)   .342 (.25)   .0096
Holland    .968 (.42)   -.169 (.14)   .205 (.27)   .0698
Italy      .765 (.18)   -.331 (.24)   .423 (.34)   .0617
Japan     1.019 (.18)   -.268 (.25)   .397 (.36)   .0925
UK        1.042 (.41)   -.218 (.08)   .364 (.17)   .0701
USA       1.039 (.41)   -.507 (.11)   .658 (.16)   .2150
AVERAGE    .981         -.273         .420         .0691

The average DSSE is 2% lower than in case 1 when this weaker prior information is used. This rather interesting result shows that weaker prior information can result in better forecasting performance; this would be expected when parameters are drifting, or when the prior information is wrong.

10.8.2.2 PRIOR DATA CASE 2, W = 10^-2

Table 65

          C             D             E            DSSE
Belgium   1.086 (.53)   -.261 (.30)   .473 (.51)   .0186
France    1.063 (.49)   -.344 (.31)   .446 (.51)   .0085
Germany   1.021 (.47)   -.289 (.31)   .429 (.51)   .0174
Holland   1.070 (.47)   -.219 (.30)   .283 (.52)   .0849
Italy      .953 (.44)   -.457 (.43)   .468 (.51)   .0766
Japan     1.228 (.50)   -.449 (.45)   .451 (.53)   .1289
UK        1.040 (.43)   -.212 (.23)   .452 (.41)   .0603
USA       1.021 (.45)   -.443 (.25)   .573 (.42)   .1644
AVERAGE   1.060         -.334         .447         .0700

Just as in 10.8.1, an increase in W gives a worsening of the average DSSE (.0700 compared with .0691). The DSSE for the USA, however, is much better (.1644 compared with .2150), which reinforces the hypothesis that the parameters have changed during the estimation period.
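
The way a non-zero W helps when a parameter shifts can be sketched with a one-parameter filter. This is an illustration under stated assumptions, not the estimation code used for the tables: it assumes that W enters as a variance added to the parameter variance each period (a random-walk parameter), the data are synthetic with a deliberate shift in the coefficient from 1.0 to 0.5 halfway through, and the names run_filter, xs and ys are hypothetical. The accuracy measure is a plain sum of squared one-step-ahead forecast errors, which stands in for, and need not equal, the DSSE.

```python
def run_filter(xs, ys, m0, p0, V, W):
    """Scalar Kalman filter for y_t = beta_t * x_t + noise, beta_t a random walk.

    Returns the final mean, the final variance and the sum of squared
    one-step-ahead forecast errors.
    """
    m, p, sse = m0, p0, 0.0
    for x, y in zip(xs, ys):
        p += W                  # old information decays: variance grows by W
        err = y - m * x         # one-step-ahead forecast error
        q = x * x * p + V       # variance of that forecast error
        k = p * x / q           # Kalman gain
        m += k * err            # updated parameter mean
        p -= k * x * p          # updated parameter variance
        sse += err * err
    return m, p, sse


# Synthetic regressor and a coefficient that shifts from 1.0 to 0.5 at t = 20.
xs = [1.0 + 0.1 * (t % 5) for t in range(40)]
ys = [(1.0 if t < 20 else 0.5) * x for t, x in enumerate(xs)]

m_fixed, _, sse_fixed = run_filter(xs, ys, m0=1.0, p0=0.04, V=0.01, W=0.0)
m_drift, _, sse_drift = run_filter(xs, ys, m0=1.0, p0=0.04, V=0.01, W=0.01)

print("W = 0    :", m_fixed, sse_fixed)
print("W = 0.01 :", m_drift, sse_drift)
```

In this constructed example the run with W = 0 adapts only slowly after the shift and accumulates a much larger sum of squared errors, which is the pattern the USA results above suggest.
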

10.8.2.3 PRIOR DATA CASE 2, W = 10^-4

Table 66

          C             D             E            DSSE
Belgium    .801 (.19)   -.219 (.07)   .548 (.13)   .0271
France     .798 (.24)   -.240 (.06)   .509 (.12)   .0192
Germany    .968 (.27)   -.276 (.07)   .289 (.12)   .0064
Holland    .659 (.27)   -.176 (.08)   .311 (.13)   .0688
Italy      .712 (.09)   -.240 (.11)   .304 (.20)   .0493
Japan      .807 (.07)   -.101 (.11)   .173 (.19)   .0545
UK        1.031 (.35)   -.310 (.04)   .301 (.07)   .0873
USA       1.119 (.32)   -.725 (.06)   .867 (.07)   .3686
AVERAGE    .862         -.286         .413         .0852

Just as in 10.8.1 when W = 10^-4, the average DSSE is much worse than for W = 10^-3 (.0852 compared with .0691), with the largest deterioration in the USA.

10.8.3 THE PRIOR INFORMATION, CASE 3

In this series of runs, the prior information was as in 10.8.1, but the standard deviation is a half (the variance a quarter) of the 10.8.1 values. Thus the prior information is much stronger.

10.8.3.1 PRIOR DATA CASE 3, W = 10^-3

Table 67

          C             D             E            DSSE
Belgium   1.074 (.15)   -.327 (.10)   .479 (.16)   .0191
France    1.075 (.13)   -.377 (.10)   .444 (.15)   .0084
Germany   1.031 (.12)   -.341 (.10)   .461 (.16)   .0165
Holland   1.085 (.12)   -.294 (.11)   .415 (.16)   .0750
Italy      .884 (.14)   -.468 (.12)   .471 (.15)   .0740
Japan     1.079 (.16)   -.466 (.13)   .452 (.15)   .1271
UK        1.055 (.11)   -.260 (.08)   .452 (.13)   .0641
USA       1.026 (.12)   -.601 (.09)   .592 (.14)   .1809
AVERAGE   1.039         -.392         .471         .0706

There is very little difference between these results and those with the prior information of case 1; the average DSSE is .0706 in both.

10.8.4 THE PRIOR INFORMATION, CASE 4

In this series of runs, the prior information was as in 10.8.1, but the prior means for the wool and synthetic fibre price elasticities were set to -0.2 and 0.2 (compared with -0.5 and 0.5). The prior variances are set to 0.04.
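
How a prior of this kind enters a three-parameter filter can be sketched as a single measurement update. The sketch assumes a linear observation equation y = x . beta + noise with a diagonal prior covariance; the function name kalman_update, the regressor values, the observation y and the observation variance V are all hypothetical and are chosen only for illustration. The prior means (1.0, -0.2, 0.2) and the prior variance 0.04 are those of case 4.

```python
def kalman_update(m, P, x, y, V):
    """One Kalman measurement update for y = x . beta + noise (variance V).

    m: prior mean (list), P: prior covariance (list of lists), x: regressors.
    Returns the posterior mean and covariance.
    """
    n = len(m)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    q = sum(x[i] * Px[i] for i in range(n)) + V    # forecast error variance
    err = y - sum(x[i] * m[i] for i in range(n))   # forecast error
    K = [Px[i] / q for i in range(n)]              # Kalman gain vector
    m_new = [m[i] + K[i] * err for i in range(n)]
    P_new = [[P[i][j] - K[i] * Px[j] for j in range(n)] for i in range(n)]
    return m_new, P_new


# Case 4 prior: means for the income, wool price and synthetic fibre price
# elasticities, each with prior variance 0.04.
m0 = [1.0, -0.2, 0.2]
P0 = [[0.04 if i == j else 0.0 for j in range(3)] for i in range(3)]

# Hypothetical observation: changes in income, wool price and synthetic
# fibre price, the observed change in demand, and the observation variance.
x = [0.05, 0.10, 0.02]
y = 0.02
V = 0.001

m1, P1 = kalman_update(m0, P0, x, y, V)
print("posterior means    :", m1)
print("posterior variances:", [P1[i][i] for i in range(3)])
```

With these assumed inputs the variance falls most for the parameter whose regressor moved most in the period, and least for the parameter whose regressor barely moved; a parameter in the second position stays closer to whatever prior it was given.
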

10.8.4.1 PRIOR DATA CASE 4, W = 10^-3

Table 68

          C             D             E            DSSE
Belgium   1.035 (.24)   -.185 (.12)   .296 (.21)   .0071
France    1.057 (.23)   -.238 (.12)   .253 (.20)   .0051
Germany   1.032 (.23)   -.228 (.12)   .212 (.20)   .0058
Holland   1.060 (.23)   -.157 (.13)   .124 (.21)   .0481
Italy      .822 (.16)   -.217 (.18)   .179 (.22)   .0198
Japan     1.039 (.17)   -.190 (.18)   .154 (.22)   .0529
UK        1.059 (.21)   -.201 (.08)   .307 (.15)   .0535
USA       1.027 (.22)   -.486 (.10)   .537 (.15)   .1643
AVERAGE   1.016         -.238         .258         .0446

The average DSSE is 37% lower than in 10.8.1.1; a considerable improvement, which is shared by all the countries. The prior information on the income elasticity has not been altered, so it is not surprising to find that the income elasticities have barely changed. The prior information on the wool and synthetic fibre price elasticities has been changed substantially, and (as we might expect) this has changed the final estimates. But the average wool price elasticity has changed much less than the average synthetic fibre price elasticity, which would indicate that, in general, there is more information in the data about the effect of wool price changes than about the effect of synthetic fibre price changes. This is probably because the price of wool fluctuates much more violently over this period than does the synthetic fibre price (see appendix F).

10.8.4.2 PRIOR DATA CASE 4, W = 10^-2

Table 69

          C             D             E            DSSE
Belgium   1.104 (.34)   -.220 (.26)   .186 (.42)   .0068
France    1.080 (.28)   -.268 (.27)   .153 (.41)   .0071
Germany   1.033 (.27)   -.225 (.27)   .185 (.42)   .0062
Holland   1.093 (.27)   -.158 (.27)   .089 (.42)   .0470
Italy      .967 (.39)   -.234 (.34)   .168 (.39)   .0208
Japan     1.232 (.44)   -.234 (.34)   .141 (.39)   .0614
UK        1.060 (.24)   -.163 (.22)   .307 (.38)   .0404
USA       1.046 (.25)   -.413 (.23)   .364 (.39)   .1059
AVERAGE   1.077         -.239         .199         .0370

When the older information is allowed to decay more quickly, slightly better forecasts are obtained than in 10.8.4.1 (an average DSSE of .0370 compared with .0446). The average wool price elasticity is unchanged, but the average synthetic fibre price elasticity is lower.
This indicates that the synthetic fibre price elasticity is lower at the end of the time period than at the beginning.

10.8.4.3 PRIOR DATA CASE 4, W = 10^-4

Table 70

          C             D             E            DSSE
Belgium    .856 (.16)   -.213 (.07)   .457 (.12)   .0189
France     .909 (.18)   -.235 (.05)   .422 (.11)   .0096
Germany    .999 (.19)   -.261 (.07)   .251 (.11)   .0065
Holland    .871 (.19)   -.170 (.08)   .270 (.12)   .0576
Italy      .740 (.08)   -.206 (.10)   .255 (.15)   .0248
Japan      .815 (.07)   -.102 (.10)   .133 (.15)   .0444
UK        1.047 (.20)   -.304 (.03)   .286 (.07)   .0806
USA       1.046 (.20)   -.694 (.06)   .811 (.07)   .3484
AVERAGE    .910         -.273         .354         .0739

The average DSSE is much worse than when W is 10^-3 (.0739 compared with .0446). The higher average synthetic fibre price elasticity confirms the hypothesis that this elasticity is lower at the end of the time period; the lower W allows the earlier data to have more weight, and that has pulled the elasticity up.

10.8.5 THE PRIOR INFORMATION, CASE 5

In this series of runs, the prior information was as in 10.8.4, but with the prior variances set to 0.16; the prior information is weaker.

10.8.5.1 PRIOR DATA CASE 5, W = 10^-3

Table 71

          C             D             E            DSSE
Belgium    .967 (.38)   -.164 (.13)   .396 (.26)   .0132
France    1.007 (.41)   -.225 (.13)   .356 (.26)   .0073
Germany   1.024 (.41)   -.227 (.13)   .237 (.25)   .0063
Holland    .966 (.42)   -.141 (.14)   .091 (.27)   .0546
Italy      .762 (.18)   -.203 (.24)   .195 (.34)   .0223
Japan     1.017 (.18)   -.133 (.25)   .153 (.34)   .0484
UK        1.042 (.41)   -.205 (.08)   .323 (.17)   .0640
USA       1.039 (.41)   -.486 (.11)   .620 (.16)   .2004
AVERAGE    .978         -.223         .296         .0521

Comparing this run with 10.8.4.1, we can see that the weakening of the prior information has made the DSSE worse, whereas (comparing 10.8.2.1 with 10.8.1.1) with the higher price elasticities, weakening the prior information improved the DSSE. This leads us to the conclusion that the prior means of 10.8.4 and 10.8.5 (price elasticities of -0.2 and 0.2) are better than those of 10.8.1 to 10.8.3.
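
The comparisons behind this conclusion are simple arithmetic on the average DSSE figures at W = 10^-3, and can be checked with a short sketch. The figures are copied from Tables 60, 64, 68 and 71; the names avg_dsse and pct_change are only illustrative.

```python
# Average DSSE at W = 10^-3, copied from Tables 60, 64, 68 and 71.
avg_dsse = {
    "case 1": 0.0706,  # prior price elasticities -0.5 / 0.5, variance 0.04
    "case 2": 0.0691,  # prior price elasticities -0.5 / 0.5, variance 0.16
    "case 4": 0.0446,  # prior price elasticities -0.2 / 0.2, variance 0.04
    "case 5": 0.0521,  # prior price elasticities -0.2 / 0.2, variance 0.16
}


def pct_change(new, old):
    """Percentage change from old to new (negative means a lower DSSE)."""
    return 100.0 * (new - old) / old


# Weakening the prior around the higher price elasticities (case 1 -> case 2).
weaken_high = pct_change(avg_dsse["case 2"], avg_dsse["case 1"])
# Weakening the prior around the lower price elasticities (case 4 -> case 5).
weaken_low = pct_change(avg_dsse["case 5"], avg_dsse["case 4"])
# Changing the prior means at the same variance (case 1 -> case 4).
change_means = pct_change(avg_dsse["case 4"], avg_dsse["case 1"])

print(round(weaken_high, 1))   # about -2 per cent
print(round(weaken_low, 1))    # about +17 per cent
print(round(change_means, 1))  # about -37 per cent
```

Weakening a prior helps when its means are poor and hurts when they are good, which is why the opposite signs of the first two figures point to the case 4 means as the better ones.
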

10.8.5.2 PRIOR DATA CASE 5, W = 10^-2

Table 72

          C             D             E            DSSE
Belgium   1.086 (.53)   -.192 (.30)   .248 (.51)   .0059
France    1.065 (.49)   -.249 (.31)   .203 (.51)   .0058
Germany   1.024 (.47)   -.219 (.31)   .203 (.25)   .0063
Holland   1.073 (.47)   -.148 (.30)   .056 (.52)   .0500
Italy      .951 (.47)   -.228 (.43)   .170 (.51)   .0209
Japan     1.227 (.50)   -.216 (.45)   .147 (.53)   .0596
UK        1.045 (.43)   -.167 (.23)   .326 (.41)   .0453
USA       1.026 (.45)   -.384 (.25)   .442 (.42)   .1234
AVERAGE   1.062         -.225         .224         .0397

When W is increased, the average DSSE improves (mainly an improvement for the USA). The average wool price elasticity is unchanged, but the average synthetic fibre price elasticity has fallen a little (from .296 to .224).

10.8.5.3 PRIOR DATA CASE 5, W = 10^-4

Table 73

          C             D             E            DSSE
Belgium    .788 (.19)   -.209 (.09)   .516 (.13)   .0252
France     .795 (.24)   -.234 (.06)   .481 (.12)   .0168
Germany    .967 (.27)   -.266 (.07)   .264 (.12)   .0063
Holland    .654 (.27)   -.164 (.08)   .281 (.13)   .0662
Italy      .709 (.09)   -.203 (.11)   .219 (.20)   .0338
Japan      .805 (.07)   -.068 (.11)   .096 (.19)   .0428
UK        1.029 (.35)   -.306 (.04)   .291 (.07)   .0859
USA       1.118 (.32)   -.717 (.06)   .858 (.07)   .3666
AVERAGE    .858         -.271         .376         .0805

Reducing the rate at which data become irrelevant gives more weight to the older data, and this has worsened the DSSE. The USA is particularly hard hit by this, because of the large change in the model's parameters over the estimation period that has been demonstrated by the earlier runs.

10.9 CONCLUSIONS FROM THE ESTIMATIONS WITH PRIOR INFORMATION

Clearly prior information can be useful, but not greatly so when there is a lot of information in the data. The best estimation with prior information had a DSSE of .037, compared to .038 for the best estimation without prior information. Prior information is of greater value in forecasting at earlier time periods, when there is less information in the data.
Again, all of the Kalman filter estimations with prior information were much better (lower DSSE) than the OLS estimation.

10.10 CONCLUSIONS

1. Recognising that the pre-1970 data were less precise than the post-1970 data led to improved forecasts. This was true in all the runs that were repeated both with and without this assumption (see, for example, 10.4.2). This is evidence for the hypothesis that it is better to recognise the variable precision of data, and take appropriate action.

2. When the relevance of old data decays with time (non-zero W), the forecasts are much more accurate than when all data are assumed to have equal relevance (see 10.5.2). This lends weight to the suggestion that old data be given less weight than more current data.

3. The Kalman filter is able to forecast the past much better than OLS for those countries with bad forecasts under OLS. Supplying the Kalman filter with prior information left its mean forecasting error about the same (see the runs of chapter 10).

4. When the DSSE from the Kalman filter is compared with the DSSE from OLS, we can see that:
   - All of the Kalman filter models show an average improvement in forecasting accuracy.
   - The OLS-estimated model has a DSSE more than five times greater than the best Kalman filter-estimated models.
   - This average change reflects the substantial improvement in the case of countries for which the OLS method performs badly because of drifting or unstable parameters. Where there is greater stability, the use of the Kalman filter may produce slightly less forecasting accuracy than OLS, but any deterioration is very slight.

5. The performance of the Kalman filter is not extremely sensitive to the choice of W. Order of magnitude increases or decreases in W do not affect the forecasting performance unduly (see 10.5.1).

6. Combining two time series into a common estimation process with restrictions across the equations is done very simply, as is the incorporation of any available prior information into the estimation.

7. When the lower precision of preliminary data is taken into account, we can expect an improvement in the model's forecasting ability (see 10.4).

8. The Kalman filter copes successfully with shocks to the system that might be expected to affect the parameter values (see 10.6.10).
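
Conclusions 1 and 7 both rest on giving less precise observations a larger observation variance V. A minimal one-parameter sketch of that mechanism follows; it assumes a scalar observation equation y = beta * x + noise, and the function name update and all numerical values (including the two values of V) are hypothetical, not those used in the runs.

```python
def update(m, p, x, y, V):
    """Scalar Kalman measurement update; returns (mean, variance, gain)."""
    q = x * x * p + V          # variance of the forecast error
    k = p * x / q              # Kalman gain
    return m + k * (y - m * x), p * V / q, k


# The same surprising observation, treated first as precise (final) data
# and then as less precise (preliminary) data. All values are hypothetical.
m0, p0, x, y = 1.0, 0.01, 1.0, 1.2
V_final, V_prelim = 0.001, 0.004

m_final, p_final, k_final = update(m0, p0, x, y, V_final)
m_prelim, p_prelim, k_prelim = update(m0, p0, x, y, V_prelim)

print("final data      :", m_final, p_final, k_final)
print("preliminary data:", m_prelim, p_prelim, k_prelim)
```

The less precise observation receives a smaller gain, so it moves the parameter estimate less and reduces its variance less, which is the sense in which the filter takes appropriate action on data of variable precision.
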