

  • Title: Fiscal readjustments in the United States: a nonlinear time-series analysis.
  • Authors: Cipollini, Andrea; Fattouh, Bassam; Mouratidis, Kostas
  • Journal: Economic Inquiry
  • Print ISSN: 0095-2583
  • Year: 2009
  • Issue: January
  • Language: English
  • Publisher: Western Economic Association International
  • Keywords: Budget deficits; Economic conditions; Fiscal policy; Time-series analysis



The recent deterioration in the U.S. budget deficit has raised serious concerns about the long-run sustainability of U.S. fiscal policy. In addressing this issue, many studies have examined whether U.S. fiscal policy respects the intertemporal government budget constraint. This constraint rules out Ponzi games, in which the government rolls over its debt in full every period by borrowing to cover both principal and interest payments, as a viable option for government finances. The no-Ponzi game restriction, which is regarded as synonymous with sustainability, requires that today's government debt be matched by an excess of future primary surpluses over primary deficits in present value terms. This condition imposes testable restrictions on the time-series properties of key fiscal measures such as the stock of public debt, the budget deficit, and the long-run relationship between government expenditures and revenues.

In a seminal article, Hamilton and Flavin (1986) suggest that a sufficient condition for the intertemporal budget constraint to hold is for the deficit inclusive of interest payments to be stationary. Wilcox (1989) extends the work of Hamilton and Flavin by allowing for stochastic interest rates and nonstationarity in the noninterest surplus. He shows that when the sustainability condition holds, the present value of the stock of public debt should be stationary and have an unconditional mean of zero. Trehan and Walsh (1988) generalize the Hamilton and Flavin result and show that if debt and deficits are integrated of order 1, and if interest rates are constant, then a necessary and sufficient condition for sustainability is that debt and primary balances (net-of-interest deficits) are cointegrated. Other studies examine the time-series properties of government spending and revenues. For instance, Hakkio and Rush (1991) show that a necessary condition for the intertemporal budget constraint to hold is the existence of cointegration between government expenditure (inclusive of interest payments) and government revenues. Quintos (1995) expands on Hakkio and Rush (1991) and introduces the concept of a strong sustainability condition, which implies that the undiscounted public debt is finite in the long run.

More recent work has emphasized the importance of nonlinearity in U.S. fiscal policy. This nonlinearity may arise if fiscal authorities react differently depending on whether the deficit has reached a certain threshold deemed to be unacceptable or unsustainable. Bertola and Drazen (1993) develop a framework that allows for trigger points in the process of fiscal adjustment such that significant adjustments in the budget deficit may take place only when the ratio of deficit to output reaches a certain threshold. This may reflect the existence of political constraints that block deficit cuts and are relaxed only when the deficit reaches a sufficiently high level deemed to be unsustainable (Alesina and Drazen 1991; Bertola and Drazen 1993).

Recent studies have found strong evidence of nonlinearity in U.S. fiscal policy. Using an exponential smooth transition autoregressive model and a long-span data set starting from 1916, Sarno (2001) provides evidence of nonlinear mean reversion in the U.S. debt-to-gross domestic product (GDP) ratio. Using a threshold autoregressive model, Arestis, Cipollini, and Fattouh (2004) provide evidence of threshold effects such that policymakers intervene to reduce the per capita deficit only when it reaches a certain threshold.

In line with the above studies, we provide new evidence of strong nonlinearity in U.S. fiscal policy. We contribute to the existing literature by extending the analysis of U.S. fiscal adjustment from a single-equation setting to a multivariate one using a nonlinear vector error correction model (VECM). This extension adds value both for our economic understanding of the fiscal adjustment process in the United States and for assessing the forecasting power of the model. First, using a multivariate threshold cointegration model, we are able to identify whether the government's solvency constraint in the United States is achieved through tax increases, spending cuts, or a combination of both. The issue of which specific item of the budget ensures fiscal readjustments has received considerable attention among U.S. policymakers and has recently been the focus of much heated debate. For instance, Rubin, Orszag, and Sinai (2004) argue that "balancing the budget for the longer term will require a combination of expenditure restraint and revenue increases." The authors believe that "the single most important act Congress and the Administration could take at this point to rein the budget over the next decade would be to re-establish the budget rules that existed in the 1990s. These put caps on discretionary spending and required that reductions in taxes or increases in mandatory spending be paid for with other tax increases or spending cuts." A study by the Congressional Budget Office (2003) also cautioned that "economic growth alone is unlikely to bring the nation's long term fiscal position into balance."

The contribution of the academic literature to this debate has been very limited. Alesina and Perotti (1995) find evidence that for fiscal adjustment to be permanent and effective, the focus must be on the level of expenditure rather than on taxation. (1) They argue that tax increases ease fiscal problems only temporarily. Temporary tax increases may also be very difficult to reverse, and as such, tax-driven deficit cuts may induce high tax ratios. Furthermore, raising taxes is unpopular, and there are doubts whether such a strategy can in fact increase government revenues. Bohn (1991) and Crowder (1997) rely on the government's intertemporal solvency condition to analyze the performance of fiscal stabilization plans over a long-term data span. Specifically, the budget item series showing most of the error-correcting dynamics is the one bearing most of the fiscal readjustment burden. Crowder (1997) shows that the large U.S. deficits in the 1980s and early 1990s were caused primarily by increases in government spending rather than falls in tax revenues. Thus, in order to restore the intertemporal budget constraint, the bulk of fiscal readjustment should occur through government spending cuts rather than increases in tax revenues. Bohn (1991) shows that regardless of the shock that caused the high budget deficit, historically these deficits have been corrected by a combination of both spending cuts and tax increases. Auerbach (2000) finds that both components of U.S. fiscal policy have been responsive to fluctuations in the deficit, although the response from government spending has been much more important.

Our results reveal the following important findings. They provide support for the existence of trigger points in U.S. fiscal policy. Specifically, we find strong evidence of nonlinearity in the fiscal process, where adjustment occurs only when the real deficit per capita reaches a certain threshold. Below this threshold, there seem to be no significant error correction effects, which may suggest that policymakers become sensitive to large deficits only when the deficit reaches a very "high" level deemed to be unacceptable or unsustainable. More importantly, we find that government expenditure shows the strongest error-correcting dynamics, and hence, the bulk of fiscal adjustment seems to occur through spending cuts rather than increases in tax revenue.

In addition to gaining a better understanding of the U.S. fiscal adjustment process, we evaluate the out-of-sample density forecast and probability forecast performance of the estimated model. Our results highlight an additional advantage from generalizing the model from a single-equation to a multivariate setting. Specifically, the results of the out-of-sample density and probability forecasts suggest that there is an improvement in forecast performance when we move from a univariate autoregressive (AR) model to a multivariate model. We also compare the out-of-sample forecast performance of the linear and threshold models. In a recent survey, Granger (2001) concludes that a major weakness of the literature on nonlinear models is that little is known about the out-of-sample forecasting properties of different nonlinear models or about how their out-of-sample forecast performance compares with that of linear models. The empirical findings suggest that, although the threshold VECM has a slightly better probability forecast performance than the linear VECM, the density forecast performance of both the linear and the nonlinear VECMs is similar for the long horizon (e.g., 2 yr ahead), and thus, we cannot recommend the use of the threshold VECM over simple linear models for forecasting purposes. Similar results have been found recently in the context of the exchange market (see for instance, Rapach and Wohar 2006). This suggests that although nonlinear models are useful to gain a better understanding of U.S. fiscal policy, they do not necessarily provide more reliable forecasts.

This paper is organized as follows. Section II describes the empirical methodology, while Section III presents the empirical results. Section IV summarizes and concludes.


II. Empirical Methodology

A. Threshold Cointegration

A VECM fitted to both G, the real government expenditure per capita, and R, the real government revenue per capita, is used to test whether there is any evidence of public finance sustainability and to test which of the two fiscal series carries the burden of fiscal readjustment (if any). Many empirical studies have concentrated on estimating the following linear VECM (where, for simplicity, we fix the VECM lag order to 1):


where [mu] is a two-dimensional vector of intercepts, [w.sub.t-1] = [G.sub.t-1] - [beta][R.sub.t-1], [alpha] is a two-dimensional vector of speed of adjustment coefficients, and [u.sub.t] is the error term vector. According to Quintos (1995), the deficit is "strongly" sustainable if the I(1) processes [R.sub.t] and [G.sub.t] are cointegrated and [beta] = 1, while it is "weakly" sustainable if [R.sub.t] and [G.sub.t] are cointegrated and 0 < [beta] < 1. Weak sustainability implies that the government constraint holds, but the undiscounted debt process is exploding at a rate that is less than the growth rate of the economy. Although this case is consistent with sustainability, it is inconsistent with the ability of the government to market its debt in the long run. Thus, in this paper, we will only test for the "strong" sustainability condition and set [beta] = 1. (2) By setting [beta] = 1, the error correction term becomes the real deficit per capita.
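
For reference, a linear VECM of lag order 1 that is consistent with the definitions above can be sketched as follows. The stacked vector $X_t$ and the short-run coefficient matrix $\Gamma$ are notation assumed here for illustration; only $\mu$, $\alpha$, $\beta$, $w_{t-1}$, and $u_t$ are named in the text.

```latex
\Delta X_t = \mu + \alpha\, w_{t-1} + \Gamma\, \Delta X_{t-1} + u_t,
\qquad X_t = (G_t,\; R_t)',
\qquad w_{t-1} = G_{t-1} - \beta R_{t-1}.
```

With $\beta = 1$, the error correction term reduces to $w_{t-1} = G_{t-1} - R_{t-1}$, the lagged real deficit per capita.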

As argued above, Equation (1) may not be the most appropriate means to characterize the fiscal adjustment process, for there may exist trigger points in the process of fiscal adjustment. Hence, in this study, we focus on the following threshold VECM:


The model given by Equation (2) allows us to test whether there are significant asymmetries in the adjustment of per capita government revenues and per capita government expenditure toward the long-run equilibrium, depending on the level of the lagged deficit per capita, [w.sub.t-1], given by [G.sub.t-1] - [R.sub.t-1]. In particular, if the real deficit per capita exceeds the trigger point [gamma], then the speed of adjustment coefficients switch from [[alpha].sub.1] to [[alpha].sub.2], as do the other short-run dynamics parameters. Hansen and Seo (2002) suggest estimating the model given by Equation (2) through maximum likelihood under the assumption that the errors [u.sub.t] are iid Gaussian. The Gaussian likelihood is

(3) $L_n = -\frac{n}{2} \log|\Sigma| - \frac{1}{2} \sum_{t=1}^{n} u_t' \Sigma^{-1} u_t$,



with the indicator function [d.sub.1t]([gamma]) taking value 1 if the deficit is below the trigger point [gamma] and 0 otherwise. Furthermore, [d.sub.2t]([gamma]) is equal to (1 - [d.sub.1t]([gamma])). In order to detect nonlinearity, Hansen and Seo (2002) use a Lagrange multiplier (LM) statistic to test [H.sub.0] (linear cointegration) versus [H.sub.1] (threshold cointegration). If the cointegrating vector is known and equal to [[beta].sub.0] (in our study, it is fixed to unity), then the LM test is given by


Given that the asymptotic critical values of the distribution of the test statistics cannot in general be tabulated, bootstrapped p values are computed using both a fixed regressor and a parametric bootstrap method, as described by Hansen and Seo (2002).
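
As a reading aid, a two-regime threshold VECM of lag order 1 that matches the verbal description in this subsection can be sketched as below. The stacked vector $X_t$ and the regime-specific matrices $\Gamma_1$ and $\Gamma_2$ are assumed notation, and writing the first regime with a weak inequality is also an assumption; the regime intercepts, the adjustment vectors $\alpha_1$ and $\alpha_2$, the trigger point $\gamma$, and the indicators $d_{1t}(\gamma)$ and $d_{2t}(\gamma)$ follow the text.

```latex
\Delta X_t =
  \bigl(\mu_1 + \alpha_1 w_{t-1} + \Gamma_1 \Delta X_{t-1}\bigr)\, d_{1t}(\gamma)
+ \bigl(\mu_2 + \alpha_2 w_{t-1} + \Gamma_2 \Delta X_{t-1}\bigr)\, d_{2t}(\gamma)
+ u_t,
\qquad
d_{1t}(\gamma) = \mathbf{1}\{w_{t-1} \le \gamma\},
\quad
d_{2t}(\gamma) = 1 - d_{1t}(\gamma).
```

Under the null of linear cointegration the two regimes share the same coefficients, so the expression collapses to a single linear VECM.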

B. Out-of-Sample Density Forecasts

To further motivate the use of the threshold VECM, we explore whether our proposed model is superior to both the univariate model and the linear model in terms of its out-of-sample forecast performance. Traditionally, the evaluation of forecast accuracy has been based on point forecasts, often using the root mean square error (RMSE). The empirical evidence suggests that linear models frequently outperform nonlinear models on the basis of the RMSE criterion alone. (3) However, several studies have recently emphasized the importance of evaluating forecast performance on the basis of an estimate of the complete probability distribution of the possible future outcomes of the series (i.e., a density forecast) as opposed to point forecasting.

More specifically, only under certainty equivalence (e.g., policymakers with quadratic loss function and linear dynamics of the predicted variable), can the RMSE be used as a criterion for choosing an optimal forecast. (4)

If certainty equivalence does not hold, then it is important to focus not only on the first moments but also on the overall density of forecasts. The density forecasts are generated through stochastic simulation, and we give in the Appendix a detailed description of this method.

First, we produce the density forecasts for both changes in government spending and tax revenues using a univariate AR model. Then, we produce the marginal density forecasts for both changes in government spending and tax revenues, [DELTA]G and [DELTA]R, respectively. We also produce the conditional density of government spending changes and tax revenues changes, [DELTA]G/[DELTA]R and [DELTA]R/[DELTA]G, respectively. Finally, we produce the joint density of government spending changes and tax revenues changes, ([DELTA]G/[DELTA]R) x [DELTA]R and ([DELTA]R/[DELTA]G) x [DELTA]G, respectively. We consider three different forecast horizons, h, equal to one, four, and eight quarters ahead. For the purpose of density forecast evaluation, in line with Clements and Smith (2000), for a given forecast horizon h, we calculate the probability integral transforms (PITs) of the actual realizations, [y.sub.t], of each fiscal series over the forecast evaluation period with respect to the model's forecast densities, given by [{[p.sub.t]([y.sub.t])}.sup.n.sub.t=1]. Therefore, we evaluate the PIT:

$z_t = \int_{-\infty}^{y_t} p_t(u)\, du$

for t = 1, ..., n. When the model's forecast density corresponds to the true predictive density, the sequence [z.sub.t] is iid U(0, 1). In line with Diebold, Gunther, and Tay (1998) and Clements and Smith (2000), we use informal data analysis to assess whether the PIT is iid U(0, 1). Evaluating the accuracy of density predictions therefore consists of assessing uniformity using Probability-Probability (PP) plots. (5) Specifically, we plot the empirical distribution function of the PIT against the 45[degrees] line, with critical values defining the confidence intervals obtained from Miller (1956). Then, in order to assess whether the PIT series is iid, we use the Lagrange multiplier test for the null of serial independence of [(PIT - [bar.PIT]).sup.j] for integer j up to order 3, with [bar.PIT] being the mean of the PIT series. (6)
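
The PIT calculation and the PP-plot comparison described above can be illustrated with a minimal Python sketch. It assumes the forecast density for each period is represented by a set of simulated draws, so the PIT is approximated by the share of draws at or below the realized value. The function names and the simulated inputs are hypothetical, and the confidence bands from Miller (1956) are not computed here.

```python
import random

def pit_from_draws(draws, realized):
    """z_t = share of simulated forecasts at or below the realized value y_t."""
    return [sum(d <= y for d in sims) / len(sims)
            for sims, y in zip(draws, realized)]

def max_gap_from_uniform(z):
    """Largest vertical gap between the empirical CDF of z and the 45-degree line."""
    zs = sorted(z)
    n = len(zs)
    gap = 0.0
    for i, v in enumerate(zs):
        # Compare the ECDF just after and just before its jump at v.
        gap = max(gap, (i + 1) / n - v, v - i / n)
    return gap

# Hypothetical example: 64 evaluation periods, 1,000 simulated forecasts each,
# with realizations drawn from the same distribution as the forecasts.
random.seed(0)
draws = [[random.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(64)]
realized = [random.gauss(0.0, 1.0) for _ in range(64)]
z = pit_from_draws(draws, realized)
print(round(max_gap_from_uniform(z), 3))
```

A small gap is what one would expect when the forecast density is correctly specified; in a PP plot the same information appears as the empirical distribution function staying close to the 45-degree line.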

Furthermore, we consider the Berkowitz (2001) approach to evaluate the accuracy of density forecasts. (7) Specifically, we apply the inverse of the Gaussian cumulative distribution function to each component of the PIT sequence, which gives [PIT.sup.*]. Under the null of iid U(0, 1) for the PIT sequence, the series [PIT.sup.*] is iid standard Gaussian. To test this, Berkowitz (2001) suggests a likelihood ratio test for the joint null of normality and iid in [PIT.sup.*]. The test statistic is $LR_B = -2[L(0, 0, 1) - L(\hat{c}, \hat{\rho}, \hat{\sigma})]$, where $L(\hat{c}, \hat{\rho}, \hat{\sigma})$ is the maximized value of the log-likelihood function of an AR(1) model fitted to [PIT.sup.*], $\hat{c}$ and $\hat{\rho}$ are the estimated intercept and autoregressive coefficient, respectively, and $\hat{\sigma}$ is the estimated standard deviation of the residuals of the AR(1). Under the null, [LR.sub.B] has a chi-squared distribution with three degrees of freedom.
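
A simplified Python sketch of a Berkowitz-type likelihood ratio test is given below. It is not the authors' implementation: it uses the conditional Gaussian likelihood of the AR(1), dropping the marginal density of the first observation, and it clips PIT values away from 0 and 1 so that the inverse Gaussian CDF stays finite. Both choices, the function name, and the simulated input are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def berkowitz_lr(pit, eps=1e-6):
    """LR test of c = 0, rho = 0, sigma = 1 in an AR(1) fitted to PIT*."""
    nd = NormalDist()
    # PIT* = inverse standard Gaussian CDF of each (clipped) PIT value.
    x = [nd.inv_cdf(min(max(p, eps), 1.0 - eps)) for p in pit]
    y, lag = x[1:], x[:-1]
    m = len(y)
    mean_y, mean_lag = sum(y) / m, sum(lag) / m
    sxx = sum((l - mean_lag) ** 2 for l in lag)
    sxy = sum((l - mean_lag) * (v - mean_y) for l, v in zip(lag, y))
    rho = sxy / sxx                      # OLS slope = conditional ML estimate
    c = mean_y - rho * mean_lag          # OLS intercept
    ssr = sum((v - c - rho * l) ** 2 for l, v in zip(lag, y))
    sigma2 = ssr / m                     # ML estimate of the residual variance
    # -2 * [logL(0, 0, 1) - logL(c_hat, rho_hat, sigma_hat)], simplified.
    lr = max(sum(v * v for v in y) - m - m * math.log(sigma2), 0.0)
    # Closed-form survival function of a chi-squared with 3 degrees of freedom.
    p_value = math.erfc(math.sqrt(lr / 2)) + math.sqrt(2 * lr / math.pi) * math.exp(-lr / 2)
    return lr, p_value

# Hypothetical input: 64 PIT values that are iid uniform by construction.
random.seed(1)
pit = [random.random() for _ in range(64)]
lr, p = berkowitz_lr(pit)
print(round(lr, 3), round(p, 3))
```

A large statistic (small p-value) indicates that the transformed PIT series departs from an iid standard Gaussian, either in its mean, its variance, or its first-order serial dependence.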

The stochastic simulation is used both to produce forecasts under any type of scenario (e.g., the density forecast) and to generate the forecasts for particular types of scenarios. Specifically, we are interested in generating the probability forecasts for two types of events (for probability forecast analysis, see Clements 2005; Galvao 2006). The first one is defined by negative changes in government spending and the second one by positive changes in tax revenues. Using the simulation method described in the Appendix, we produce 1,000 h-step-ahead forecasts for government spending changes (conditional on the available information set), and we count how many of these forecasts are negative. This number divided by 1,000 gives the probability forecast for the government spending series. The same methodology is applied to generate the probability forecast for the tax revenue series. We repeat this exercise by increasing the overall sample by one additional observation till we reach the end of the forecast evaluation period. We use the following indicators of probability forecast accuracy (Galvao 2006):

$\mathrm{QPS} = \frac{1}{T} \sum_{t=1}^{T} 2 (P_t - R_t)^2$

$\mathrm{LPS} = \frac{1}{T} \sum_{t=1}^{T} \left| (1 - R_t) \ln(1 - P_t) + R_t \ln(P_t) \right|$,

where [P.sub.t] and [R.sub.t] are, respectively, the probability forecast and the actual realization of the event one is interested in predicting. The QPS ranges from 0 to 2, with 0 indicating perfect accuracy, while the LPS ranges from 0 to [infinity]. The LPS and QPS imply different loss functions, with large mistakes penalized more heavily under the LPS.
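
The probability forecast and the two accuracy scores above can be sketched in a few lines of Python. The function names and the example numbers are hypothetical, the realization is coded as 1 when the event occurs and 0 otherwise, and the clipping of probabilities inside the LPS is an added assumption that keeps the logarithms finite when a simulated probability is exactly 0 or 1.

```python
import math

def probability_forecast(sim_changes, event):
    """Share of simulated h-step-ahead changes for which the event occurs."""
    return sum(1 for x in sim_changes if event(x)) / len(sim_changes)

def qps(probs, outcomes):
    """Quadratic probability score: 0 is perfect, 2 is the worst possible."""
    return sum(2.0 * (p - r) ** 2 for p, r in zip(probs, outcomes)) / len(probs)

def lps(probs, outcomes, eps=1e-12):
    """Log probability score: 0 is perfect, unbounded above."""
    total = 0.0
    for p, r in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # assumption: clip to avoid log(0)
        total += abs((1 - r) * math.log(1 - p) + r * math.log(p))
    return total / len(probs)

# Hypothetical example: probability of a spending cut from four simulated changes.
p_cut = probability_forecast([-0.4, 0.2, -0.1, 0.3], lambda x: x < 0)
print(p_cut, qps([p_cut], [1]), round(lps([p_cut], [1]), 4))
```

In the exercise described in the text, the first function would be applied to the 1,000 simulated forecasts at each forecast origin, and the scores would then be averaged over the evaluation period.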


III. Empirical Results

A. Data and Data Sources

The data set used in this study comprises quarterly observations from the first quarter of 1947 to the last quarter of 2004. We examine the dynamics of real per capita expenditure and real per capita revenues, and hence, we only focus on the strong sustainability condition, as described by Quintos (1995). We first collect data on nominal current federal expenditure (inclusive of interest payments) and current federal revenues (seasonally adjusted). We deflate both series by the implicit GDP deflator to obtain real values. The series are then divided by population to obtain real per capita government expenditure and real per capita government revenues. All the data have been obtained from the Federal Reserve Economic Data database available from the Federal Reserve Bank of St. Louis.
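
The two-step transformation described above amounts to the following arithmetic, shown here as a minimal Python sketch. The function name, the index base of 100, and the example numbers are hypothetical; the actual units of the source series are not stated in the text.

```python
def real_per_capita(nominal, deflator, population, base=100.0):
    """Deflate a nominal series by a price index, then divide by population."""
    return [n / (d / base) / p for n, d, p in zip(nominal, deflator, population)]

# Hypothetical values: nominal spending, a deflator with base 100, and population.
nominal_spending = [200.0, 220.0]
gdp_deflator = [50.0, 55.0]
population = [4.0, 4.0]
print(real_per_capita(nominal_spending, gdp_deflator, population))
```

Applying the same function to expenditure and revenue keeps the two series in comparable units, so their difference can be read as the real deficit per capita.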

B. In-Sample Forecasting Analysis

The augmented Dickey-Fuller (ADF) and the Phillips-Perron tests for the null of a unit root (see the first two columns of Table 1) suggest that we cannot reject the null hypothesis of nonstationarity in the levels of real per capita government expenditure and real per capita government revenue. These findings are also confirmed by the tests developed by Ng and Perron (2001) under GLS detrending using the modified AIC to select the optimal lag order. Specifically, as for the tax revenue series, the [MZ.sup.GLS.sub.a] and [ADF.sup.GLS] tests suggest that we cannot reject the null of a unit root at any significance level. As for the tax revenue series, according to the [MZ.sup.GLS.sub.a], we cannot reject the null of a unit root, whereas using the [ADF.sup.GLS], we cannot reject the null of a unit root at the 1% significance level.

Before carrying on with cointegration analysis, we select the VECM lag length. The results are reported in Table 2 Panel A for the linear VECM and Table 2 Panel B for the threshold VECM. As can be seen from this table, both the AIC and BIC statistics pick a lag of 1. This holds for both the linear and the threshold VECMs. (8)

We next test for the existence of threshold effects in the VECM using the SupLM statistic. As can be seen from Table 3, the [SupLM.sup.0] statistic suggests a strong presence of threshold effects where the null hypothesis of no threshold can be rejected at the 5% level. The Wald tests also point in the same direction. The null hypothesis that the error correction coefficients and dynamic coefficients are the same in both regimes can be rejected at 5% and 1% levels, respectively.

The parameter estimates were calculated by minimization of log |[SIGMA]([gamma])| over 300 grid points for the parameter [gamma]. The estimates are reported in Table 4. The estimated threshold is $8.859 per capita, which implies that the first regime occurs when the real deficit per capita is less than or equal to $8.859. This regime contains 82% of the sample observations. The second regime occurs when the real deficit per capita is above the threshold of $8.859. Following Hansen and Seo (2002), we label the first regime as the "typical" regime and the second regime as the "unusual" regime. The results in Table 4 show that the typical regime has no significant error correction effects, with the coefficients on the lagged error correction terms in both equations [DELTA][R.sub.t] and [DELTA][G.sub.t] insignificant at the conventional levels. In contrast, error correction effects occur only in the unusual regime--that is, when the real deficit per capita has risen above the estimated threshold. Interestingly, the results indicate that fiscal readjustment occurs through spending cuts rather than through increases in tax revenue: while the estimated coefficient on the error correction term in the government expenditure equation is large and highly significant, the estimated coefficient on the error correction term in the revenue per capita equation is quite small and not significant at the conventional levels.
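
The grid search over the threshold can be illustrated with the following Python sketch, which assumes the cointegrating coefficient is fixed at 1 and the lag order is 1, as in the text. It is not the authors' code: the function name, the 5% trimming that keeps both regimes populated, and the simulated data are assumptions, and NumPy is used for the regime-by-regime least squares fits.

```python
import numpy as np

def fit_threshold(G, R, n_grid=300, trim=0.05):
    """Grid search for gamma in a two-regime VECM with beta fixed at 1 and one lag."""
    X = np.column_stack([G, R]).astype(float)
    dX = np.diff(X, axis=0)            # dX[i] is the change from period i to i + 1
    w = X[:, 0] - X[:, 1]              # error correction term G - R
    Y = dX[1:]                         # left-hand side: Delta X_t
    w_lag = w[1:-1]                    # w_{t-1}
    Z = np.column_stack([np.ones(len(Y)), w_lag, dX[:-1]])  # [1, w_{t-1}, Delta X_{t-1}]
    lo, hi = np.quantile(w_lag, [trim, 1.0 - trim])  # keep both regimes populated
    best_gamma, best_logdet = None, np.inf
    for gamma in np.linspace(lo, hi, n_grid):
        below = w_lag <= gamma
        resid = np.empty_like(Y)
        for mask in (below, ~below):   # separate OLS fit in each regime
            coef, *_ = np.linalg.lstsq(Z[mask], Y[mask], rcond=None)
            resid[mask] = Y[mask] - Z[mask] @ coef
        sigma = resid.T @ resid / len(Y)
        logdet = np.linalg.slogdet(sigma)[1]
        if logdet < best_logdet:
            best_gamma, best_logdet = float(gamma), float(logdet)
    return best_gamma, best_logdet

# Simulated (hypothetical) data: spending corrects only when the deficit exceeds 1.
rng = np.random.default_rng(0)
T = 232
G_sim, R_sim = np.zeros(T), np.zeros(T)
for t in range(1, T):
    gap = G_sim[t - 1] - R_sim[t - 1]
    adjust = -0.3 if gap > 1.0 else 0.0
    G_sim[t] = G_sim[t - 1] + adjust * gap + rng.normal()
    R_sim[t] = R_sim[t - 1] + rng.normal()
gamma_hat, logdet_hat = fit_threshold(G_sim, R_sim)
print(round(gamma_hat, 3), round(logdet_hat, 3))
```

Because the residual covariance is a step function of the threshold, a grid search is a natural choice here; a gradient-based optimizer would not be reliable for this criterion.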

In Figure 1, we plot the deviations of the real deficit per capita from the estimated threshold point estimate over the sample period. Note that in this figure, positive values identify the unusual regime, whereas the negative values identify the typical regime. Figure 1 clearly shows that there have been four major shifts from the typical to the unusual regime in the real deficit per capita dynamics. First, a major shift occurred in February 1975 during the peak of the 1973 oil crisis, which plunged the U.S. economy into a deep recession. A second major shift occurred in March 1981. This shift, which occurred during the Reagan presidency, corresponds to the effects of the legislation passed by the Congress aimed at cutting personal income taxes over the next 3 yr (the 1981 Economic Recovery Tax Act). Since the tax cuts were not met by equal cuts in government spending, the federal budget went into large deficit and remained so for a considerable period of time. It is only in March 1987 that we witness a regime shift back toward the typical regime. This switch reflects in part the intensive political and economic debate in the Congress and in the media and the efforts made by fiscal authorities to reduce the large and growing budget deficit. These efforts were manifested in the Tax Reform Act of 1986 and the Balanced Budget and Emergency Deficit Control Act, which called for progressive reduction in the deficit and the achievement of a balanced budget by the early 1990s (Ippolito 1990).


Despite the efforts made to balance the budget, another major shift (the third one) from the typical to the unusual regime occurred in February 1991. This switch occurred during the Senior Bush presidency and corresponds closely to the recession that plagued the U.S. economy at the beginning of Senior Bush's term and later to the budgetary requirements of the Gulf War. In February 1994, there was a switch to the typical regime, which lasted for the rest of the 1990s. This coincided with President Clinton's move to the White House and the importance he attached to balancing the budget in his economic policy. Finally, in April 2002, there was a switch from the typical to the unusual regime. This switch corresponds to the current President Bush's administration with its emphasis on cutting taxes and boosting defense and security outlays, which has caused large budget deficits.

C. Out-of-Sample Forecasting Analysis

We compare the out-of-sample forecast performance of the linear model and the threshold cointegration model. We leave out the last 64 observations of the sample for density forecast evaluation. More specifically, the forecast evaluation period for the one-quarter-ahead predictions starts from the first quarter of 1989, which corresponds to the beginning of the George Bush senior administration, and ends in the last quarter of 2004.

In order to produce out-of-sample forecasts, we estimate recursively three different model specifications (univariate AR, linear VECM, and nonlinear VECM). We concentrate on one-quarter-, 1-yr-, and 2-yr-ahead predictions. For the one-quarter-ahead projections, we initially consider the sample that ends in the last quarter of 1988, and then we increase the sample by one observation each time period until we reach a sample period that ends in the third quarter of 2004. In order to produce four-quarters-ahead predictions, we initially consider the sample that ends in the first quarter of 1988, and then we increase the sample by one observation each time period until we reach a sample period that ends in the last quarter of 2003. Finally, to produce eight-quarters-ahead predictions, we initially consider the sample that ends in the first quarter of 1987, and then we increase the sample by one observation each time period until we reach a sample period that ends in the last quarter of 2002.
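
The recursive scheme can be made concrete by listing, for each horizon, the last in-sample quarter (the forecast origin) that corresponds to each target quarter. The Python sketch below takes the evaluation window to be the first quarter of 1989 through the last quarter of 2004, i.e. the 64 held-out observations; the function names are hypothetical.

```python
def shift_quarter(year, quarter, k):
    """Move a (year, quarter) pair by k quarters (k may be negative)."""
    idx = year * 4 + (quarter - 1) + k
    return idx // 4, idx % 4 + 1

def forecast_origins(first_target, last_target, h):
    """Last in-sample quarter for each h-step-ahead forecast in the window."""
    origins = []
    target = first_target
    while target <= last_target:
        origins.append(shift_quarter(*target, -h))
        target = shift_quarter(*target, 1)
    return origins

for h in (1, 4, 8):
    origins = forecast_origins((1989, 1), (2004, 4), h)
    print(h, origins[0], origins[-1], len(origins))  # 64 origins per horizon
```

Each origin marks the end of one estimation sample, so every model is re-estimated 64 times per horizon under this expanding-window design.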

The out-of-sample point forecast evaluation in Table 5 shows that the evidence is inconclusive: the RMSE values corresponding to the point forecasts of the government spending series obtained from the different models are close to each other at the different forecast horizons. Moreover, although the nonlinear VECM is the worst in the one-quarter-ahead point prediction of tax revenues, the different models have a similar performance for the 1- and 2-yr forecast horizons.

The results from Table 6 suggest that none of the models proposed is capable of providing a good density forecast for the tax revenue series. Specifically, although the PP plots for the PIT sequence (see the right-hand-side panel of Figures 2-8) show the 45[degrees] line inside the confidence interval bands for all the model specifications and for most of the forecast horizons (with the exception of the 1-yr-ahead density forecast from the univariate AR model--see Figure 2), the Ljung-Box test suggests evidence of serial correlation in the first and third moments of the PIT sequence. (9) As for government spending, there is an improvement in density prediction performance when we move from the univariate AR model to the multivariate model and as we consider a forecast horizon longer than one quarter. In particular, even though there is no evidence of serial correlation in the first, second, and third moments of the PIT sequence corresponding to AR density forecasts (Table 6), the corresponding PP plots show the 45[degrees] line outside the confidence interval bands (see the left-hand-side panel of Figure 2) for density prediction over 1 and 2 yr, respectively.

As for the multivariate models, the density forecast performance for the linear and threshold VECM specifications is similar for the long horizon (e.g., 2 yr ahead). The Ljung-Box test suggests absence of serial correlation in the first, second, and third moments of the PIT sequence for the marginal, conditional, and joint density forecasts of government spending produced by both the linear and the nonlinear VECMs and for any forecast horizon (Table 6). However, the PP plots for the PIT associated with the threshold VECM marginal, conditional, and joint density forecasts of government spending have the 45[degrees] line inside the confidence interval bands only when we consider an eight-step-ahead forecast horizon (Figures 6-8). From Figures 3-5, we can observe that the PP plots for the PIT associated with the linear VECM marginal, conditional, and joint density forecasts of government spending have the 45[degrees] line inside the confidence interval bands for any forecast horizon.

Using the Berkowitz (2001) test, from Table 7, we can observe that the strongest rejection of the null hypothesis of normality and iid for the inverse of the cumulative (standardized) Gaussian distribution with respect to the PIT sequence applies to the AR and linear VECM. When we use a threshold VECM, there is a mild nonrejection of the null hypothesis if we turn our focus on the conditional and joint density (one-step-ahead) forecasts of government spending changes and on the marginal and joint density (eight-step-ahead) forecasts of government spending changes.








Finally, the probability forecast exercise confirms the results obtained from the density forecast evaluation. As mentioned in Section II above, we are interested in evaluating the model forecast performance regarding events that can be associated with fiscal readjustments, and these are either positive changes in tax revenues or negative changes in government spending. Therefore, as also noted above, we need to compute probability forecasts and evaluate them in terms of QPS and LPS scores. As for government spending (Table 8 Panel A), the best performer for any type of prediction horizon is the nonlinear VECM, where QPS and LPS scores are considerably lower than the corresponding ones for the AR model. As for the tax revenues (Table 8 Panel B), the worst probability forecast performance is the one associated with the nonlinear VECM for the one-quarter-ahead probability forecast. There are gains from moving from a univariate AR modeling framework to a multivariate model if the prediction horizon is either 1 or 2 yr ahead, and the nonlinear VECM is the best performer (in terms of QPS and LPS scores) if the forecast horizon is 2 yr ahead.


In this paper, we investigate empirically the U.S. government's intertemporal solvency condition and assess whether the government's solvency constraint has been achieved mainly through tax increases, spending cuts, or a combination of both. Using a threshold vector error correction estimation procedure, we find evidence that government authorities would intervene only when the deficit per capita had reached a certain threshold. Our results show that the bulk of fiscal adjustment occurs through spending cuts rather than increases in tax revenue.

In terms of forecasting, the picture is mixed. By evaluating the out-of-sample density forecast performance of the estimated model, we show that there is an improvement in forecast performance when we move from a univariate AR model specification to a multivariate model. However, we find that the forecasting performance of the linear and nonlinear VECMs is similar at long horizons (e.g., 2 yr ahead), and thus, we cannot recommend the use of the threshold VECM over simple linear models for forecasting purposes.

This suggests that our proposed model could be improved upon and should be evaluated in comparison with not only alternative multivariate nonlinear models but also multivariate linear models with structural breaks. One might also consider a time trend or an indicator of the U.S. business cycle as an additional threshold variable (beyond the government deficit) in the nonlinear multivariate model. Recently, Galvao (2006) has found that the U.S. term spread performs well in predicting U.S. industrial production using a threshold VAR with both a time trend and the term spread as threshold variables.

These extensions can prove very fruitful avenues for future research.


ADF: Augmented Dickey-Fuller

AIC: Akaike Information Criterion

AR: Autoregressive

BIC: Bayesian Information Criterion

GDP: Gross Domestic Product

GLS: Generalized Least Squares

iid: Independent and Identically Distributed

LM: Lagrange Multiplier

LPS: Logarithmic Probability Score

PIT: Probability Integral Transform

PP: Probability-Probability

QPS: Quadratic Probability Score

RMSE: Root Mean Square Error

VECM: Vector Error Correction Model


The stochastic simulation method explained in Galvao (2006) is used to produce the joint density forecasts. Define $x_t$ as the vector of endogenous variables $\{\Delta G, \Delta R\}'$ and $X^t = \{x_{t-1}, x_{t-2}, \ldots, x_1\}$ as the history at time $t$. Given an estimate of $A$ from the linear VECM $x_t = f(X^{t-1}; A) + u_t$ and of the sample covariance matrix of residuals, $\hat{\Sigma}$, a trial sequence of forecasts $x_{t+1}, x_{t+2}, x_{t+3}, \ldots, x_{t+h}$ is built as follows. A random vector $u_{t+1}$ is drawn from the distribution $u \sim N(0, \hat{\Sigma})$, and it is used to calculate $\hat{x}_{t+1}$, given $X^t$ and $\hat{A}$. Then, $\hat{x}_{t+1}$ is added to the "history" to form $\hat{X}^{t+1}$. This procedure is continued until the sequence of forecasts $\{x_{t+1}, x_{t+2}, x_{t+3}, \ldots, x_{t+h}\}$ is complete. This sequence of forecasts can be called $S_1$, and the same trial is repeated to obtain a set of 1,000 forecast sequences.

In the case of threshold models, the forecasting model can also be written as $x_t^j = f^j(X^{t-1}; \beta^j) + \epsilon_t^j$, where $j = 1, 2$ indicates the two regimes. Therefore, given $\hat{\Sigma}^1$ and $\hat{\Sigma}^2$, which are the estimated covariances for the two regimes, we obtain the forecast sequence as follows. Given the one-step-ahead point forecast, either the vector $u^1_{t+h}$ is drawn from $u^1 \sim N(0, \hat{\Sigma}^1)$ or the vector $u^2_{t+h}$ is drawn from $u^2 \sim N(0, \hat{\Sigma}^2)$, depending on whether the deficit is below or above the estimated threshold. The realizations for this vector of innovations are then used to calculate $\hat{x}_{t+1}$, given $X^t$ and $\hat{\beta}^j$. Then $\hat{x}_{t+1}$ is added to the history to form $\hat{X}^{t+1}$. This procedure is continued until the sequence of forecasts $\{x_{t+1}, x_{t+2}, x_{t+3}, \ldots, x_{t+h}\}$ is complete. This sequence of forecasts can be called $S_m$, and the same trial is repeated to obtain a set of 1,000 forecast sequences.

For each sequence of forecasts $S_m$ (with $m$ describing the $m$th scenario), we pick the last vector of observations, i.e., $x_{t+h}$. The first component of this vector describes the joint model prediction for the (change in the) government spending series associated with scenario $m$, and the second component of this vector describes the joint model prediction for the (change in the) tax revenue series associated with scenario $m$.
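The simulation steps above can be sketched in Python for a bivariate two-regime model with one lag. This is an illustration under stated assumptions, not the authors' GAUSS code: the cointegrating coefficient is taken to be 1, so the error-correction term is the deficit G - R; the regime is selected by comparing that term with the threshold; and all parameter values and function names are hypothetical. A linear VECM corresponds to the special case in which both regimes share the same parameters.

```python
import math
import random


def draw_shock(rng, cov):
    """Draw one bivariate N(0, cov) vector using a 2x2 Cholesky factor."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[0][1] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return l11 * e1, l21 * e1 + l22 * e2


def simulate_sequence(rng, state, regimes, threshold, h):
    """Build one forecast sequence S_m and return its last vector (dG, dR)."""
    G, R, dG, dR = state
    for _ in range(h):
        w = G - R  # error-correction term (deficit), assuming beta = 1
        reg = regimes[0] if w <= threshold else regimes[1]
        u = draw_shock(rng, reg["cov"])  # regime-specific innovation
        step = []
        for i in range(2):
            g = reg["gamma"][i]
            step.append(reg["c"][i] + reg["alpha"][i] * w
                        + g[0] * dG + g[1] * dR + u[i])
        dG, dR = step
        G, R = G + dG, R + dR  # add the simulated step to the "history"
    return dG, dR


def joint_density_forecast(state, regimes, threshold, h, n_sims=1000, seed=0):
    """Repeat the trial n_sims times; each entry is one draw from the joint density."""
    rng = random.Random(seed)
    return [simulate_sequence(rng, state, regimes, threshold, h)
            for _ in range(n_sims)]


# Hypothetical parameter values, chosen only to make the sketch run.
calm = {"c": (0.3, 0.1), "alpha": (-0.01, 0.02),
        "gamma": ((-0.2, -0.1), (0.0, 0.1)), "cov": ((0.6, 0.1), (0.1, 2.0))}
stressed = {"c": (3.0, 1.0), "alpha": (-0.25, 0.0),
            "gamma": ((-0.15, -0.1), (-0.5, -0.7)), "cov": ((1.0, 0.2), (0.2, 3.0))}
draws = joint_density_forecast((30.0, 25.0, 0.2, 0.1), (calm, stressed), 8.0, h=8)
print(len(draws), draws[0])
```

The first component of each draw is the simulated change in government spending and the second the simulated change in tax revenues, mirroring the ordering of the forecast vector described above.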


The methodology to generate the sequence of forecasts S (by picking the last observation in this sequence) is similar to the method described in Appendix A1. The only exception is that one of the two innovations is fixed to a specific value, and this gives the conditional density forecast. In particular, if we fix the innovation to tax revenues to the sample mean of this series and let the other shock (i.e., the one affecting government spending) take 1,000 realizations from Gaussian random draws, then we are able to generate the density forecast of government spending conditional on the sample mean value of tax revenues. Furthermore, if we fix the innovation to government spending to the sample mean of this series and let the other shock (i.e., the one affecting tax revenues) take 1,000 realizations from Gaussian random draws, then we are able to generate the density forecast of tax revenues conditional on the sample mean value of government spending.


The methodology to generate the sequence of forecasts S (by picking the last observation in this sequence) is similar to the method described in Appendix A1. However, the simulation method involves calibration to the sample standard deviation of each series and not to the overall sample covariance matrix. Specifically, the only difference from the method described in Appendix A1 is that the different realizations of an iid shock (using standardized Gaussian random draws) are multiplied by the sample standard deviation of government spending, thereby giving the marginal density forecast of government spending. Finally, if we multiply the different realizations of an iid shock (using standardized Gaussian random draws) by the sample standard deviation of tax revenues, we obtain the marginal density forecast of tax revenues.


Given the estimation of an AR(1) for each of the two series, the density forecast at horizon $h$ for one series is given by $x_{t+h} = \hat{\alpha}_0 h + (\hat{\alpha}_1^h x_t + \hat{\alpha}_1^{h-1} u_{t+1} + \ldots + u_{t+h})$, where $\hat{\alpha}_0$ and $\hat{\alpha}_1$ are the estimated intercept and autoregressive coefficient of each series, respectively.
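A simulation-based AR(1) density forecast can be sketched as follows. The sketch assumes the recursion x_{t+1} = a0 + a1 x_t + u_{t+1} with Gaussian innovations of standard deviation sigma and iterates it step by step; the function names and parameter values are hypothetical. Under this recursion the intercept enters the h-step forecast as the geometric sum a0 (1 + a1 + ... + a1^(h-1)), which equals a0 h only in the limiting case a1 = 1.

```python
import random


def ar1_density_forecast(x_t, a0, a1, sigma, h, n_sims=1000, seed=0):
    """Simulate n_sims draws of x_{t+h} by iterating the AR(1) recursion h times."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        x = x_t
        for _ in range(h):
            x = a0 + a1 * x + rng.gauss(0.0, sigma)
        draws.append(x)
    return draws


def ar1_forecast_moments(x_t, a0, a1, sigma, h):
    """Mean and variance of x_{t+h} implied by the same recursion."""
    mean = a0 * sum(a1 ** i for i in range(h)) + a1 ** h * x_t
    var = sigma ** 2 * sum(a1 ** (2 * i) for i in range(h))
    return mean, var


# Hypothetical values: compare the simulated mean with the implied mean.
draws = ar1_density_forecast(x_t=1.0, a0=0.5, a1=0.5, sigma=1.0, h=2)
print(sum(draws) / len(draws), ar1_forecast_moments(1.0, 0.5, 0.5, 1.0, 2))
```

Because the innovations are Gaussian, the simulated draws should be centered on the implied mean with a spread close to the implied variance, which gives a quick check of the simulation.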


Alesina, A., and A. Drazen. "Why Are Stabilizations Delayed?" American Economic Review, 81, 1991, 1170-88.

Alesina, A., and R. Perotti. "Fiscal Expansion and Adjustment in OECD Countries." Economic Policy, 21, 1995, 206-47.

Arestis, P., A. Cipollini, and B. Fattouh. "Threshold Effects in the U.S. Budget Deficit." Economic Inquiry, 42, 2004, 214-22.

Auerbach, A. J. "Formation of Fiscal Policy: The Experience of the Past Twenty-Five Years." FRBNY Economic Policy Review, 6, 2000, 9-23.

Berkowitz, J. "Testing Density Forecasts, with Applications to Risk Management." Journal of Business and Economic Statistics, 19, 2001, 465-74.

Bertola, G., and A. Drazen. "Trigger Points and Budget Cuts: Explaining the Effects of Fiscal Austerity." American Economic Review, 83, 1993, 11-26.

Bohn, H. "Budget Balance through Revenue or Spending Adjustments?" Journal of Monetary Economics, 27, 1991, 333-59.

Christoffersen, P. F., and F. Diebold. "Optimal Prediction under Asymmetric Loss." Econometric Theory, 13, 1997, 808-17.

Clements, M. P. Evaluating Econometric Forecasts of Economic and Financial Variables. Basingstoke, United Kingdom: Palgrave Macmillan, 2005.

Clements, M. P., and J. Smith. "Evaluating the Forecast Densities of Linear and Nonlinear Models: Applications to Output Growth and Unemployment." Journal of Forecasting, 19, 2000, 255-76.

Congressional Budget Office. "The Long-Term Budget Outlook." Washington, DC: CBO Publications Office, 2003.

Crowder, W. "The U.S. Federal Intertemporal Budget Constraint: Restoring Equilibrium through Increased Revenues or Decreased Spending." Manuscript, 1997.

Cunado, J. L., A. Gil-Alana, and F. Perez de Gracia. "Is the US Fiscal Deficit Sustainable? A Fractionally Integrated Approach." Journal of Economics and Business, 56, 2004, 501-26.

Diebold, F. X., T. A. Gunther, and A. S. Tay. "Evaluating Density Forecasts with Applications to Financial Risk Management." International Economic Review, 39, 1998, 863-83.

Diebold, F. X., and J. A. Nason. "Non-Parametric Exchange Rate Prediction." Journal of International Economics, 28, 1990, 315-32.

Galvao, A. B. C. "Structural Break Threshold VAR for Predicting US Recessions Using the Spread." Journal of Applied Econometrics, 21, 2006, 463-87.

Granger, C. W. J. "An Overview of Nonlinear Macroeconometric Empirical Models." Macroeconomic Dynamics, 5, 2001, 466-81.

Hakkio, C. S., and M. Rush. "Is the Budget Deficit 'Too Large'?" Economic Inquiry, 29, 1991, 429-45.

Hamilton, J. D., and M. A. Flavin. "On the Limitation of Government Borrowing: A Framework for Empirical Testing." American Economic Review, 76, 1986, 808-19.

Hansen, B., and B. Seo. "Testing for Two Regime Threshold Cointegration in Vector Error Correction Models." Journal of Econometrics, 110, 2002, 293-318.

Ippolito, D. S. Uncertain Legacies: Federal Budget Policy from Roosevelt through Reagan. Charlottesville, VA: University Press of Virginia, 1990.

Martin, G. M. "US Deficit Sustainability: A New Approach Based on Multiple Endogenous Breaks." Journal of Applied Econometrics, 15, 2000, 83-105.

Miller, L. H. "Table of Percentage Points of Kolmogorov Statistics." Journal of the American Statistical Association, 51, 1956, 111-21.

Ng, S., and P. Perron. "Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power." Econometrica, 69, 2001, 1519-54.

Quintos, C. E. "Sustainability of the Deficit Process with Structural Shifts." Journal of Business and Economic Statistics, 13, 1995, 409-17.

Rapach, D., and M. Wohar. "The Out-of-Sample Forecasting Performance of Nonlinear Models of Real Exchange Rate Behavior." International Journal of Forecasting, 22, 2006, 341-61.

Rubin, R. E., P. R. Orszag, and A. Sinai. "Sustained Budget Deficits: Longer-Run U.S. Economic Performance and the Risk of Financial and Fiscal Disarray." Paper presented at the AEA-NAEFA Joint Session, Allied Social Science Associations Annual Meeting, 2004.

Sarno, L. "The Behavior of US Public Debt: A Non-Linear Perspective." Economics Letters, 74, 2001, 119-25.

Sarno, L., and G. Valente. "Comparing the Accuracy of Density Forecasts from Competing Models." Journal of Forecasting, 23, 2004, 541-57.

Spanos, A. Probability Theory and Statistical Inference: Econometric Modeling with Observational Data. Cambridge: Cambridge University Press, 1999.

Trehan, B., and C. E. Walsh. "Common Trends, the Government Budget Constraint and Revenue Smoothing." Journal of Economic Dynamics and Control, 12, 1988, 425-44.

Wilcox, D. W. "The Sustainability of Government Deficits: Implications of the Present Value Borrowing Constraint." Journal of Money, Credit and Banking, 21, 1989, 291-306.

(1.) Alesina and Perotti (1995) use the long-run, cyclically adjusted primary deficit to identify periods of fiscal readjustment. Specifically, a very tight fiscal policy in year t occurs when the cyclically adjusted deficit decreases by more than 1.5% of GDP. A successful fiscal adjustment in year t occurs when a tight fiscal policy implemented in year t is such that the gross debt-to-GDP ratio in year t + 3 is at least 5 percentage points lower than that in year t.

(2.) Most recent empirical studies also suggest evidence of strong sustainability either without regime shifts, as shown by Cunado, Gil-Alana, and Perez de Gracia (2004), or with regime shifts (Arestis, Cipollini, and Fattouh 2004; Martin 2000).

(3.) Diebold and Nason (1990) give four reasons why nonlinear models, although they have better in-sample fit than linear models, may fail to dominate in terms of out-of-sample forecast performance based on the RMSE (see also Clements and Smith 2000).

(4.) Christoffersen and Diebold (1997) show that under asymmetric loss, the optimal forecast is the conditional mean plus a bias term, which depends on both the forecaster's loss function and the conditional variance of the predicted variable.

(5.) PP plots provide a visual inspection of the discrepancy between shapes created by the patterns of points on a plot and a reference straight line.

(6.) A high order is chosen because, as noted by Diebold, Gunther, and Tay (1998), dependence may be present in higher moments.

(7.) Recently, an alternative approach to evaluate the accuracy of density forecast has been suggested by Sarno and Valente (2004).

(8.) For robustness, we also estimated the VECM with two lags. The results are very similar to those obtained with one lag, and to save space, we do not report them. The results are available from the authors upon request.

(9.) It is worth noting that empirical distribution tests, such as PP plots, are valid only under the assumption that the PIT follows an iid process (Spanos 1999).


* The authors wish to thank three anonymous referees. All the computations have been carried out using GAUSS. The authors wish to thank Serena Ng, Bruce Hansen, and Byeongseon Seo for making available the GAUSS routines.

Cipollini: University of Essex, School of Accounting, Finance and Management, Wivenhoe Park, CO4 3SQ Colchester, UK. Phone +44 1206872314, E-mail [email protected]

Fattouh: Department for Financial and Management Studies, CeFiMS, SOAS, University of London, Thornhaugh Street, Russell Square, London WC1H 0XG, United Kingdom. Tel 0044-(20)78984053, Fax 0044-(20) 78984089, E-mail [email protected]

Mouratidis: Swansea University, School of Business and Economics Swansea, Singleton Park, Swansea, SA2 8PP, Wales UK. Phone +44 (0) 1792 295364, Fax +44 (0) 1792 295626, E-mail [email protected]

TABLE 1 Unit Root Tests on the Level of the Series R and G

      ADF       PP       ADF^GLS    MZ^GLS_a

R    -0.339     0.534    0.960      1.051
G     0.498    -0.514    2.053      1.633

TABLE 2 Lag Order for Linear VECM and Threshold

Lag Order AIC BIC

(A) Linear VECM
1 -9.781 -6.887
2 -9.474 -5.156
3 -6.78 -1.055
4 -0.409 6.711

(B) Threshold VECM
1 -12.87 -7.082
2 -11.24 -2.609
3 -6.782 -1.055
4 7.798 22.04

TABLE 3 Tests for Threshold Cointegration

β = 1

Lagrange multiplier threshold test statistic    18.700
Fixed regressor asymptotic p value                .062
Bootstrap p value                                 .085

             Wald Test for Equality of

Dynamic Coefficients        VECM Coefficients

Wald test = 23.19           Wald test = 6.12
p value = .000              p value = .046

Notes: The p values for the LM threshold test were
obtained by 5,000 bootstrap replications.

TABLE 4 Estimates of the Threshold VAR

β = 1

Threshold estimate = 8.859

Regime 1

Intercept                     0.312 (0.066)     0.068 (0.101)
w_{t-1}                      -0.010 (0.014)     0.023 (0.022)
ΔG_{t-1}                     -0.216 (0.135)    -0.040 (0.099)
ΔR_{t-1}                     -0.094 (0.055)     0.101 (0.138)
% Observations in regime      82

Regime 2

Intercept                     3.137 (1.160)     0.988 (1.336)
w_{t-1}                      -0.242 (0.096)    -0.012 (0.113)
ΔG_{t-1}                     -0.142 (0.125)    -0.514 (0.246)
ΔR_{t-1}                     -0.088 (0.084)    -0.697 (0.153)
% Observations in regime      18

Notes: Standard errors in parentheses.

TABLE 5 RMSE for Point Forecast

                             AR            Linear VECM      Threshold VECM

Forecast Horizon (h)     ΔG      ΔR       ΔG      ΔR        ΔG      ΔR

1                       0.823   1.507    0.824   1.517     0.821   1.781
4                       0.808   1.503    0.774   1.465     0.784   1.484
8                       0.795   1.509    0.807   1.462     0.827   1.514

Notes: The RMSE associated with the point forecasts has been obtained
by recursive estimation of both linear and nonlinear VECMs using the
sample running from January 1989 to April 2004 as the forecast
evaluation period.

TABLE 6 LM Test for iid of PIT

(A) One-Quarter-Ahead Forecasts

 AR Linear VECM


1 .143 .002 .127 .006
2 .560 .284 .510 .104
3 .228 .051 .209 .051


1 .126 .008
2 .515 .120
3 .211 .076

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

1 .122 .006
2 .500 .112
3 .211 .0753

 Threshold VECM

Moment [DELTA] G [DELTA] R

1 .130 .014
2 .463 .132
3 .189 .077


1 .126 .015
2 .469 .138
3 .199 .0824

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

1 .126 .003
2 .548 .228
3 .185 .057

(B) Four-Quarters-Ahead Forecasts

 AR Linear VECM


1 .147 .001 .174 .004
2 .465 .326 .544 .199
3 .362 .093 .264 .050


1 .162 .0103
2 .624 .161
3 .245 .116

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

1 .154 .006
2 .539 .108
3 .249 .0593

 Threshold VECM

Moment [DELTA] G [DELTA] R

1 .097 .012
2 .592 .119
3 .148 .075


1 .108 .007
2 .657 .095
3 .188 .0304

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

1 .095 .011
2 .625 .105
3 .147 .053

(C) Eight-Quarters-Ahead Forecasts

 AR Linear VECM


1 .134 .002 .165 .006
2 .459 .301 .568 .152
3 .319 .099 .272 .058


1 .172 .008
2 .606 .175
3 .265 .089

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

1 .168 .008
2 .505 .147
3 .251 .0656

 Threshold VECM

Moment [DELTA] G [DELTA] R

1 .121 .008
2 .526 .081
3 .181 .057


1 .107 .009
2 .592 .063
3 .175 .028

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

1 .094 .007
2 .512 .097
3 .137 .041

Notes: The table records the p values for [chi square] LM tests of
serial correlation (up to fourth order) for the first, second, and
third moments of the PIT series.

TABLE 7 Berkowitz Test

(A) One-Quarter-Ahead Forecasts

 AR Linear VECM


 .000 .000 .000 .000


 .000 .000

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

 .000 .000

 Threshold VECM


 .000 .000


 .072 .000


 .104 .000

(B) Four-Quarters-Ahead Forecasts

 AR Linear VECM


 .013 .000 .000 .000


 .000 .000

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

 .000 .000

 Threshold VECM


 .010 .000


 .013 .000


 .012 .000

(C) Eight-Quarters-Ahead Forecasts

 AR Linear VECM


 .000 .000 .000 .000


 .000 .000

 ([DELTA] G/ ([DELTA] R/
 x [DELTA] R x [DELTA] G

 .000 .000

 Threshold VECM


 .074 .000


 .031 .000


 .048 .000

Notes: The entries are the p values of the Berkowitz (2001)
likelihood ratio test for the joint null of normality and iid in PIT*,
which is the inverse of the cumulative normal distribution of the

TABLE 8 Probability Forecast Evaluation

Model             Horizon (quarters)     QPS      LPS

(A) Government Spending

AR                        1              .467     .660
                          4              .514     .707
                          8              .497     .690
Linear VECM               1              .473     .664
                          4              .443     .635
                          8              .441     .633
Nonlinear VECM            1              .464     .652
                          4              .434     .626
                          8              .440     .632

(B) Tax Revenues

AR                        1              .478     .671
                          4              .505     .698
                          8              .499     .692
Linear VECM               1              .496     .692
                          4              .478     .671
                          8              .479     .672
Nonlinear VECM            1              .584     .873
                          4              .481     .678
                          8              .470     .662

Notes: The three entries in each cell (from the top to the bottom)
are the QPS and LPS scores for the one, four, and eight quarters