National institute economic forecasts 1968 to 1991: some tests of forecast properties.
Pain, Nigel; Britton, Andrew
Introduction
From time to time the Institute has published in the Review
assessments of its economic forecasts based on statistical analysis of
their relation with the outturn data. The most recent examples are
Savage (1983) for GDP forecasts and NIESR (1984) for inflation. A more
extensive analysis has now been carried out covering quarterly forecasts
over the period from 1968 to 1991 for a range of economic variables. The
full results are reported in Pain and Britton (1992). This note
summarises some of the main findings.
At the present time there is particular concern over recent
forecasting performance. In common with most other forecasters the
Institute substantially underpredicted the strength of the economy in
the late 1980s and also failed to foresee the onset and duration of the
recent recession. This experience suggests the need to re-evaluate the
conclusions of earlier studies of this kind.
In the next section of this note we discuss the Institute's
forecasts of some key macroeconomic variables, illustrated with charts
of forecast and outturn. The third section reports the main findings of
the statistical analysis, which tests the efficiency and bias of the
forecasts. It includes tests over the period as a whole and over three
sub-samples. The fourth section considers the relationship between
forecast errors and data revisions.
The results are consistent with earlier studies in showing that
generally Institute forecasts are unbiased and contain information
useful for prediction. This can be shown for the period as a whole,
including the 1980s, and tests for a deterioration in the forecasts in
recent years are not conclusive. Nevertheless the results for some key
variables in recent years are disappointing. It would appear that
forecasting has become substantially more difficult, probably because of
shifts in underlying behavioural relationships which have not yet been
fully incorporated into econometric models.
National Institute forecasts
This study focuses on National Institute forecasts of the annual
growth in real expenditure, inflation, average earnings, employment and
real personal disposable income. We examine both aggregate measures,
such as Gross Domestic Product and domestic demand, and individual
components of expenditure. The forecasts are all for growth between two
particular quarters. This provides a stricter test of forecasting
performance than the use of forecasts for growth in a calendar year
since the latter contain an important element of estimated outturn.
The forecasts are taken from successive quarterly issues of the
National Institute Economic Review over the period from
1968Q1 to 1990Q2.(1) We treat the forecasts as a consistent set even though
the structure of the Institute's domestic macroeconometric model
has changed considerably over time, with advances in econometric
techniques, economic theory, the extent of sectoral disaggregation and
the treatment of expectations.
Previous studies of the Institute forecasts have considered a
narrower range of variables over a smaller sample period. For example,
Holden and Peel (1985) looked at real expenditure and inflation
forecasts over the second half of the 1970s alone. Use of a wider range
of variables and a larger sample enables us to ask whether forecasting
performance has changed over time and whether any observed forecasting
difficulties arise from particular variables.
The optimal base point for the annual growth forecasts is
determined by the 'information lag' - the time between the last
available published outturn data and the time at which the forecast is
undertaken. In general a forecast for time t can be thought of as being
made at time t-i based on information up to t-i-j, where j is the
information lag. In the UK the typical information set consists of full
information for t-i-2, plus partial information for t-i-1 and some
early (usually financial) information for t-i. Holden and Peel (1985)
use t-i-2 as the base point as this reflects the lags in the publication
of a full set of National Accounts figures. In this study we have chosen
to proceed on the assumption that forecasters have a reasonable picture
of activity in the quarter prior to the forecast. For many years the
Institute distinguished data for t-i-1 as an 'estimate' and
figures for t-i as a forecast. Whilst publication lags vary over
time, it is generally the case that the forecaster has knowledge of
policy settings, retail sales, conditions in the labour market, the
balance of payments and retail prices for most, if not all, of t-i-1. We
therefore concentrate on growth between t-i-1 and t, with i=3.(2)
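To make the timing convention concrete, the sketch below (Python with pandas, using a purely hypothetical quarterly series) shows how the four-quarter growth target for a forecast made in 1968Q1 lines up with the base quarter 1967Q4, consistent with note (2); the numbers are illustrative only.

```python
import pandas as pd

# Hypothetical quarterly GDP levels; purely illustrative numbers.
gdp = pd.Series(
    [100.0, 100.8, 101.5, 102.1, 102.6, 103.4, 104.0, 104.9],
    index=pd.period_range("1967Q4", periods=8, freq="Q"),
)

# Four-quarter growth between t-i-1 and t (i=3): growth over the year
# ending in the target quarter t.
four_qtr_growth = 100 * (gdp / gdp.shift(4) - 1)

# A forecast published in 1968Q1 targets growth between 1967Q4 and 1968Q4,
# so its outturn is the four-quarter growth rate at 1968Q4.
print(four_qtr_growth[pd.Period("1968Q4", freq="Q")])
```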
Whilst the use of forecasts made over the period 1968Q1-1990Q2
yields a sample of 90 observations, it is important to remember that the
observations are not fully independent as we focus upon four-quarter
growth forecasts. In cases where the sampling interval is smaller than
the forecast interval the error term will exhibit serial correlation via
a moving average process of order (i+j-1) for a forecast i+j steps
ahead. This means that whilst the coefficients obtained using OLS may be
unbiased, the covariance matrix will be inconsistent. To overcome this
we follow Hansen and Hodrick (1980) in using the Generalised Method of
Moments technique to correct the covariance matrix. A similar correction
is utilised by Brown and Maital (1981) and Holden and Peel (1985)
amongst others. For the forecasts described above the error will follow
a MA(3) process.(3)
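To illustrate why the overlap matters, the following sketch (Python, using simulated white-noise quarterly shocks rather than actual forecast errors) shows that four-quarter-ahead errors built from overlapping quarterly news display autocorrelation out to lag 3 and little beyond; it is a schematic of the mechanism, not the Institute's error process.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# White-noise quarterly shocks standing in for the news arriving each quarter.
shocks = rng.normal(size=n)

# A four-quarter-ahead error is roughly the sum of the four quarterly shocks
# arriving over the forecast interval, so adjacent errors share three shocks.
errors = np.array([shocks[t:t + 4].sum() for t in range(n - 3)])

def autocorr(x, lag):
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

for lag in range(1, 6):
    print(f"lag {lag}: {autocorr(errors, lag):+.2f}")
# Approximately 0.75, 0.50, 0.25 at lags 1-3 and near zero thereafter:
# an MA(3) pattern, hence the need to correct the OLS covariance matrix.
```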
The NIESR forecasts of selected variables are shown in Charts 1-10
along with the latest available outturn (consistent with the July 1991
issue of Economic Trends). Perhaps the most striking feature of these
charts is that for many variables the forecasts have succeeded in
anticipating the direction of the short-term variation in the growth
rate. What the forecasts fail to appreciate is the extent of the
variation so that the resulting forecast errors appear to be strongly
procyclical.
Five episodes are of interest over this period: the rapid growth in
demand in 1973 and 1987/88 (with the subsequent inflationary pressures)
and the recessions of 1974/5, 1980/81 and 1991. In some respects the
errors associated with the onset of the latest recession may be worse
than those of 1980/81 as the downturn in overall activity was not
foreseen at all. In contrast the errors in 1980 resulted from a failure
to forecast the depth of the recession.
The recent poor performance in forecasting total activity does not
appear to be entirely due to errors in forecasting domestic expenditure.
Chart 2 shows that the forecasts made for domestic demand did pick up
some of the downturn in 1990, although the errors are again positively
correlated with the cycle. This is particularly true for 1988, with an
error of over 5 per cent in the first quarter of the year. This by
itself was not unprecedented; what was unusual was the sustained run of
underpredictions, last seen to any extent around 1973. Some of the
divergence between the GDP and domestic demand forecast errors is
accounted for by the export volume forecasts shown in Chart 5, with the
errors made towards the end of the 1980s on a par with those made in the
early 1980s at a time of a similar appreciation in the real exchange
rate. The errors made in the forecasts of import volumes shown in Chart
6 closely reflect those made in the domestic demand forecasts.
The consumer price inflation forecasts shown in Chart 7 suggest
that forecast errors became considerably smaller in the latter half of
the 1980s, although this is likely to have been helped by the reduced
volatility of inflation. The 1970s were notable for a tendency to
underpredict inflation. Similar trends are apparent in the average
earnings forecasts shown in Chart 8. The employment forecasts in Chart 9
display a tendency to underestimate the fluctuations in the growth
rate,(4) possibly reflecting the errors made in the GDP forecasts.
Finally the forecasts made for real personal disposable income are given
in Chart 10. Here there does not appear to have been any noticeable
change in forecasting performance over time.
Regression analysis of outturn and forecast
The main aim of the regression analysis is to examine whether the
forecasts described above can be said to satisfy the minimum
requirements expected of an optimal (or rational) forecast. A necessary
condition for rationality is that the forecast is an unbiased estimate
of the realised value so that:
0 = E(A_t - F_{t,t-i} | I_{t-i})    (1)
where A_t is the actual outturn, F_{t,t-i} is the forecast of A_t made at time t-i, I_{t-i} is the information set available at the time of the forecast and E denotes the expectations operator.
The standard test of rationality is to test the joint hypothesis α = 0 and β = 1 in (2):
A_t = α + β F_{t,t-i} + ε_t    (2)
Rejection of the null implies that the forecasts could be improved by knowledge of the α and β parameters and therefore provides evidence of inefficiency in the forecast (Mincer and Zarnowitz, 1969). In particular, if β ≠ 1 then the forecast and the forecast error must be correlated and it would therefore be possible to improve the forecast by exploiting the correlation.(5) Equation (2) is also often presented as a test of unbiasedness. However, Holden and Peel (1990) show that the null that α = 0 and β = 1 is only a sufficient condition for the absence of bias. A necessary and sufficient condition is given by α = (1 - β)E(F_{t,t-i}). Thus it is possible to test for bias by testing the null that α = 0 in (3):
A_t - F_{t,t-i} = α + ε_t    (3)
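As a rough guide to how tests of (2) and (3) might be implemented, the sketch below uses statsmodels with a Newey-West HAC covariance allowing for three lags; this is a common stand-in for the Hansen-Hodrick correction rather than the exact procedure used in the study, and the outturn and forecast series are placeholders.

```python
import numpy as np
import statsmodels.api as sm

# Placeholder four-quarter growth series; substitute the actual outturn and
# forecast data here.
rng = np.random.default_rng(1)
forecast = rng.normal(2.0, 1.5, 90)
outturn = 0.2 + 0.9 * forecast + rng.normal(0.0, 1.0, 90)

# Equation (2) with an MA(3)-robust covariance matrix (Newey-West HAC,
# three lags, as an approximation to the Hansen-Hodrick correction).
eff = sm.OLS(outturn, sm.add_constant(forecast)).fit(
    cov_type="HAC", cov_kwds={"maxlags": 3}
)
# Joint efficiency null: alpha = 0 and beta = 1.
print(eff.wald_test("const = 0, x1 = 1", use_f=False))

# Equation (3): bias test, alpha = 0 in a regression of the error on a constant.
bias = sm.OLS(outturn - forecast, np.ones(len(outturn))).fit(
    cov_type="HAC", cov_kwds={"maxlags": 3}
)
print(bias.tvalues)  # t-statistic on the mean forecast error
```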
It is also useful to amend (2) and (3) so as to investigate whether the results obtained are sensitive to particular unanticipated exogenous shocks such as strikes and major legislative changes. Such unforeseen (and arguably unforecastable) events can generate large outliers in the observed forecast errors. Dummies are used below for the dock strikes of 1970, 1972 and 1979, the three-day week of 1974, the VAT shift in 1979 and the introduction of the poll tax in 1990. The impact of the three-day week can be seen in Chart 1, which shows the forecasts for GDP. The amended equations take the form:
A_t = α + β F_{t,t-i} + Σ_i δ_i DUM(i) + ε_t    (2a)
A_t - F_{t,t-i} = α + Σ_i δ_i DUM(i) + ε_t    (3a)
Dummy variables for all of these events are commonly found in most
UK macroeconomic models.
It is also possible to test whether forecast performance has changed significantly over time. This may be done by writing (2) as:
A_t = α_0 + α_1 D + β_0 F_{t,t-i} + β_1 D F_{t,t-i} + ε_t    (4)
where D is a dummy variable equal to 1 from the time of the break point onwards. Equation (4) is estimated over the whole sample. The absence of a structural break involves a joint test of α_1 = β_1 = 0.
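A corresponding sketch of the break test in (4), again with placeholder series and a Newey-West HAC covariance as an approximation, interacts a post-1983Q4 dummy with the intercept and the forecast; the target-quarter dating is illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Placeholder data; `targets` marks the quarter to which each forecast refers.
rng = np.random.default_rng(2)
forecast = rng.normal(2.0, 1.5, 90)
outturn = 0.2 + 0.9 * forecast + rng.normal(0.0, 1.0, 90)
targets = pd.period_range("1968Q4", periods=90, freq="Q")

# Equation (4): interact a post-1983Q4 dummy with the intercept and forecast.
D = (targets >= pd.Period("1983Q4", freq="Q")).astype(float)
X = sm.add_constant(np.column_stack([D, forecast, D * forecast]))
fit = sm.OLS(outturn, X).fit(cov_type="HAC", cov_kwds={"maxlags": 3})

# Wald test of no structural break: alpha_1 = beta_1 = 0 (columns x1 and x3).
print(fit.wald_test("x1 = 0, x3 = 0", use_f=False))
```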
The results of applying (2) and (2a) to the quarterly forecasts made since 1968 are summarised in Table 1. The outturn data used are the most recently available. The respective null hypotheses are tested using Wald χ² statistics obtained using the amended covariance matrix, derived with the Hansen-Hodrick procedure. Absolute values of the resulting t-statistics are given in parentheses.(6) Results are also given for three sub-samples of 30 observations each in order to assess changes in performance over time.
[TABULAR DATA OMITTED]
Over the period as a whole the hypothesis of efficiency cannot be
rejected with regard to GDP growth, domestic demand, investment,
exports, imports, inflation, earnings or employment. In all these cases
the intercept terms ([alpha]) are not statistically significant, and the
slope coefficients ([BETA]) are not significantly different from unity.
The hypothesis of efficiency can be rejected however for the forecasts
of consumer spending and real personal disposable income. In both cases
there is some evidence of systematic underprediction over the period as
a whole.
The correlation coefficients indicate the proportion of the
variation in the outturn which is 'explained' by the forecast. This
is one way of measuring how useful the forecasts have been, or the
extent to which they have reduced the subjective uncertainty of future
events. For GDP the coefficient is about one-third over the period as a
whole (once the dummies are included). For the nominal variables, prices
and wages, the coefficients are much higher, but this must be seen in
relation to the greater persistence of inflation than of real growth.
The results for sub-periods indicate considerable variation in
performance in different time periods. The middle sample of 1976-83 is
closely related to that used in Holden and Peel (1985) and reveals
little evidence of inefficiency, with the null being rejected for the
real disposable income forecast alone. The evidence of inefficiency
appears greatest in the final sub-sample over the period from 1983Q4
onwards, with the null hypothesis of efficiency being accepted only for
GDP, investment, earnings and employment. In the cases of GDP and
exports there is no correlation between forecast and outturn over that
period at all; for domestic demand and for imports the correlation is
relatively high, but the forecasts systematically underpredict the
variation in the outturn. Clearly forecasting performance was not as
good in the last sub-period as it had been previously. But the question
remains whether it was significantly less successful, or whether this is
just an example of the variation in performance which is to be expected
from time to time.
Equivalent regressions using (3) and (3a) are reported in Pain and
Britton (1992). The results largely confirm the findings in Table 1,
with many of the inefficient forecasts also being biased. An additional
point of interest is the change in the direction of bias in the GDP
forecasts in the various subsamples, with a consistent underprediction
of growth in the final period. This is also reflected in the domestic
demand errors and appears to largely stem from the investment forecasts.
For consumers' expenditure there is evidence of significant
underprediction throughout the entire sample. This is also true of the
disposable income forecasts.
Tests for a structural break
The question as to whether performance could be said to have
changed significantly in the final sub-sample was addressed by using (4)
with D equal to 1 from 1983Q4 onwards. The resulting regressions are
given in Table 2, along with Wald tests for a structural break. (Dummy
variables were included, but are not reported.)
The test statistics are best interpreted as asking whether any
inefficiencies in the forecasts from 1983Q4 onwards differ from those in
earlier forecasts. Somewhat surprisingly, there appears to be
evidence of structural breaks only in the relationships for investment and
disposable income when the latest outturns are used, although the
statistics for GDP, domestic demand and inflation are significant at the
10 per cent level. The most likely explanation is that many coefficients
are poorly determined in the final subsample and therefore consistent
with a number of different hypotheses. If the initial outturns are used,
the null is rejected for four variables, GDP, inflation, employment and
disposable income.
Overall, the evidence presented thus far is suggestive of some
deterioration in forecast performance since the mid-1980s, although the
experience of forecasting different variables is by no means uniform.
Indeed for some variables such as investment it is not possible to
reject the null that the most recent forecasts are collectively unbiased
and efficient, even though the relationship between the forecast and the
outcome appears to have changed significantly.
Was the information set used efficiently?
The absence of bias and inefficiency in (2) and (3) is only a
necessary condition for rationality. A further requirement is that
forecasts should fully reflect the information in the past history of
the variable concerned as well as previous forecast errors. Figlewski and Wachtel (1981) test for efficiency using a regression of the forecast error (defined using the first, or initial, outturn) on the most recent forecast error perceived at the time of the forecast:
A_t - F_{t,t-i} = α + β(A_{t-i-1} - F_{t-i-1,t-2i-1}) + ε_t    (5)
If the forecast is efficient and takes into account any lessons learnt from the past then it should be possible to impose α = β = 0.
An alternative approach is to use an equation of the form:
A_t = α + β F_{t,t-i} + γ A_{t-i-1} + ε_t    (6)
This equation allows us to compare the information in the NIESR
forecasts with that from a simple time series model that projects growth
at the last observed rate at the time of the forecast. The t-statistic
on [BETA] provides a test of whether the forecast itself adds any
relevant information (Fair and Shiller, 1990). The t-statistic on
[gamma] indicates whether the forecast has omitted any relevant
information (allowing for possible bias).
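A sketch of the encompassing regression (6), under the same assumptions as the earlier sketches (placeholder series, Newey-West HAC covariance as an approximation to the Hansen-Hodrick correction), is as follows.

```python
import numpy as np
import statsmodels.api as sm

# Placeholder series: outturn, forecast and the lagged own growth rate
# A_{t-i-1} known when the forecast was made.
rng = np.random.default_rng(3)
lagged_growth = rng.normal(2.0, 1.5, 90)
forecast = 0.5 * lagged_growth + rng.normal(0.0, 1.0, 90)
outturn = 0.3 + 0.8 * forecast + 0.1 * lagged_growth + rng.normal(0.0, 1.0, 90)

# Equation (6): does the forecast add information beyond a naive projection
# of the last observed growth rate, and vice versa?
X = sm.add_constant(np.column_stack([forecast, lagged_growth]))
enc = sm.OLS(outturn, X).fit(cov_type="HAC", cov_kwds={"maxlags": 3})
print(enc.tvalues)  # order: constant, forecast (beta), lagged growth (gamma)
```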
The relationship between the forecast errors in (5) is reported in
Table 3. Over the period as a whole there appears to be relatively
little relationship between the errors, with the null only being
rejected for investment, employment and disposable income. For
investment, the departure from rationality largely stems from a failure
to make full use of the information contained in past errors during the
second sub-sample.
[TABULAR DATA OMITTED]
As with the previous tests, the evidence of inefficiency is
greatest in the final sub-sample, with the null being rejected for
consumption, inflation, earnings, employment and disposable income. In
most cases this is associated with a significant intercept term, which
simply confirms the bias found in some earlier tests. In the cases of
inflation and earnings (and also GDP) the coefficient on the latest
error terms is significant. This suggests that forecasters could have
done better over this period if they had adjusted their initial
projections to take account of the errors observed in previous
forecasts. In an informal way forecasters always try to learn from their
mistakes. The results in Table 3 suggest that there was considerable
scope for doing so during the 1980s.
The results from the forecast encompassing test ((6) augmented by
dummy variables) are reported in Table 4. Over the sample as a whole,
all the forecasts contain significant information, even though some are
inefficient. The domestic demand, exports and earnings regressions are
the only ones in which the lagged own growth rates are close to being
significant at conventional levels. The corollary to this is that there
is relatively little in the lagged information set of the variables
alone that can account for the forecast errors.
[TABULAR DATA OMITTED]
Again the results differ across the sub-samples, with the
investment forecasts in the first period and the import forecasts in
both the first and second periods containing little useful information.
In the most recent period this is true of the GDP, export and inflation
forecasts. The problems with GDP may arise from exports since the
domestic demand forecast seems to contain more useful information. The
results for earnings are also of interest, since in both the first and
third sub-samples the forecast and the lagged growth rate are
significant, suggesting a greater degree of inertia in wages than the
forecasts allowed for.
The relationship between data revisions and forecast errors
This final section addresses the hypothesis that the observed
forecast errors arise from the false signals that forecasters have
received about the position of the economic cycle as a result of errors
in the National Accounts. One common explanation for recent forecast
errors is that there has been a deterioration in the quality of official
statistics, see for example Hibberd (1990). This concern mainly relates
to the apparent failure of initial estimates to fully capture the extent
of the variation in the growth rate rather than its direction of
movement. Underlying the problem of data revisions is the feeling that
if forecasters knew where the economy was they would be able to provide
a better picture of where it was going. This argument can be assessed in
two ways. First, it is possible to examine individual forecasts to see
how important data errors were. One example of such a study is Blake and
Pain (1991). Alternatively, it is possible to test the importance of
data revisions across the cycle as a whole. This provides some
indication of whether data revisions have a consistent adverse effect on
forecasts. For example, whilst data errors may have contributed to
errors made in 1987/88, it is less clear how they influenced the failure
to spot the recession in 1990/1.
In assessing the importance of data revisions it is very important
to specify the period to which forecasts relate. A forecast of output
growth between year i-1 and year i made in year i will necessarily, as a
matter of arithmetic, depend heavily on early estimates of the level of
output in year i-1. If the provisional estimates are incorrect then the
growth forecast could well prove inaccurate. Alternatively if we are
concerned only with forecasts of growth over periods which are genuinely
in the future the effect of data revisions will be ambiguous. If the
estimates of activity in the pre-forecast period were to be revised up,
this could lead forecasters either to raise or lower their forecasts for
growth in the future depending on the model and methods they are using.
The general relationship between the latest data for growth and the
initial estimate(7) is typified by the domestic demand figures in Chart
11, with the initial estimates providing a reasonable estimate of the
true growth rate and capturing a number of turning points in the
respective series. Whilst there was a period between 1986 and 1988 when growth in domestic demand was continually underestimated, the errors were not noticeably larger than in earlier periods, with a string of similar errors being made in the early 1970s at the time of the 'Barber boom'.(8)
The results from a test of whether there is a significant
relationship between forecast errors and data revisions are reported in
Pain and Britton (1992). We find significant effects for both the trade
volumes and for real disposable income in the full sample and domestic
demand and investment in the first two sub-samples, although (as yet)
there does not appear to be a firmly established relationship in the
final sub-sample. The null is rejected for consumption in the final
sample, but this largely reflects the consistently positive forecast
error.
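The flavour of such a test can be conveyed by regressing the forecast error on the corresponding data revision; the sketch below is illustrative only, with placeholder series, and is not the specification reported in Pain and Britton (1992).

```python
import numpy as np
import statsmodels.api as sm

# Placeholder series: forecast errors and the revision to pre-forecast growth
# (latest vintage minus the estimate available when the forecast was made).
rng = np.random.default_rng(4)
revision = rng.normal(0.0, 0.8, 90)
error = 0.1 + 0.3 * revision + rng.normal(0.0, 1.0, 90)

# A significant slope would suggest that false signals about the starting
# point of the economy fed through into the forecast errors.
rev = sm.OLS(error, sm.add_constant(revision)).fit(
    cov_type="HAC", cov_kwds={"maxlags": 3}
)
print(rev.tvalues)  # t-statistics on the constant and the revision term
```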
Overall the results are mixed. This suggests that there is
relatively little firm evidence that data inaccuracies have a
significant influence on forecast errors. Indeed the evidence we present
may simply reflect the known effects on growth rates associated with
rebasing. However this conclusion is necessarily tentative. In
particular it does not imply that preliminary data have not been a
source of error in individual forecasts.
Conclusions
The positive finding from this study is that over the period from
1969 to the present day the Institute forecasts are generally unbiased
and weakly efficient and clearly contain information significant for
prediction. Previous studies had suggested that this was true of
forecasts up till 1980, but we did not know beforehand that the results
would hold up when the forecasts made over the last decade were
included. On this basis it is still possible to conclude that economic
forecasting is a worthwhile activity, although the evidence for this is
not as strong as it was ten years ago and the experience of forecasting
different variables is far from uniform.
NOTES
(1) In the case of the Institute, the forecast is typically begun at the end of the first month in the quarter, with the Review published at the end of the second month.
(2) Thus for 1968Q1 we take the forecast for growth between 1967Q4 and 1968Q4.
(3) It is also possible to correct for possible heteroskedasticity in the estimation of the standard errors of the coefficients. We have not done this at present. Evidence in Mishkin (1990) suggests that such corrections, in contrast to the serial correlation ones, do not help to produce better statistical inference.
(4) This may simply indicate that the forecasts are optimal; see Mincer and Zarnowitz (1969).
(5) We interpret evidence of inefficiency as a departure from rationality, although, as Zellner (1986) illustrates, if loss functions are asymmetric then biased predictors may be optimal.
(6) The reported t-statistics indicate whether α and β are significantly different from zero.
(7) The initial estimate of annual growth up to a particular quarter consists of the growth rate perceived at the time of the forecast undertaken in the following quarter. It thus consists of three quarters' worth of official data plus an estimate made by the forecaster.
(8) This is less true of some of the components of expenditure. For example, the initial investment estimates for 1988 underpredicted the true growth rate by as much as 8 to 9 per cent.
REFERENCES
Blake, A.P. and N.C. Pain (1991), 'Data adjustment and forecast performance', National Institute Economic Review, 135, 66-78.
Brown, B.W. and S. Maital (1981), 'What do economists know? An empirical study of experts' expectations', Econometrica, 49, 491-504.
Fair, R.C. and R.J. Shiller (1990), 'Comparing information in forecasts from econometric models', American Economic Review, 80, 375-389.
Figlewski, S. and P. Wachtel (1981), 'The formation of inflationary expectations', Review of Economics and Statistics, LXIII, 1-10.
Hansen, L.P. and R.J. Hodrick (1980), 'Forward exchange rates as optimal predictors of future spot rates: an econometric analysis', Journal of Political Economy, 88, 829-853.
Hibberd, J. (1990), 'Official statistics in the late 1980s', Treasury Bulletin, Summer 1990, 2-13.
Holden, K. and D.A. Peel (1985), 'An evaluation of quarterly National Institute forecasts', Journal of Forecasting, 4, 227-234.
Holden, K. and D.A. Peel (1990), 'On testing for unbiasedness and efficiency of forecasts', The Manchester School, LVIII, 120-127.
Mincer, J. and V. Zarnowitz (1969), 'The evaluation of economic forecasts', in Economic Forecasts and Expectations, ed. J. Mincer, Columbia University Press.
Mishkin, F.S. (1990), 'Does correcting for heteroscedasticity help?', Economics Letters, 34, 351-356.
NIESR (1984), 'The National Institute's forecasts of inflation, 1964-82', National Institute Economic Review, 107, 47-49.
Pain, N. and A. Britton (1992), 'The recent experience of economic forecasting in Britain: some lessons from National Institute forecasts', National Institute Discussion Paper (New Series) No. 20.
Savage, D. (1983), 'The assessment of the National Institute's forecasts of GDP, 1959-82', National Institute Economic Review, 105, 29-35.
Zellner, A. (1986), 'Biased predictors, rationality and the evaluation of forecasts', Economics Letters, 21, 45-48.