Vector Autoregression (VAR) Theory

When we have several time series, we need to take into account the interdependence between them. The VAR model is a very useful starting point in the analysis of the interrelationships between the different time series.
The VAR is just a multiple time-series generalization of the AR model. The VAR model is easy to estimate because we can use the OLS method. The VAR is commonly used for forecasting systems of interrelated time series and for analyzing the dynamic impact of random disturbances on the system of variables.
In its general form the VAR is written
     yt = A1 yt-1 + ... + Ap yt-p + β xt + εt
where yt is a k-vector of endogenous variables, xt is a d-vector of exogenous variables, A1, …, Ap and β are matrices of coefficients to be estimated, and εt is a vector of innovations that may be contemporaneously correlated with each other but are uncorrelated with their own lagged values and with all of the right-hand side variables.
 In practice, since we are not considering any moving average errors, the autoregressions would probably need more lags to be useful for prediction; otherwise, univariate ARMA models would do better. Suppose that we consider, say, six lags for each variable in a small system with four variables. Then each equation has 24 parameters to be estimated (ignoring intercepts), and we have 96 parameters to estimate overall. This overparameterization is one of the major problems with the VAR model. Unrestricted VAR models have not been found very useful for forecasting, and extensions that impose restrictions on the parameters have been suggested.
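 The arithmetic behind this count is easy to check. The sketch below is only an illustration; the function name var_param_count and its intercept flag are made up for this example and do not come from any library.

```python
def var_param_count(n_vars, n_lags, intercept=False):
    """Return (parameters per equation, parameters in the whole system)."""
    # Each equation has n_lags coefficients on every one of the n_vars variables.
    per_equation = n_vars * n_lags + (1 if intercept else 0)
    # There is one equation per variable.
    return per_equation, per_equation * n_vars


print(var_param_count(4, 6))                  # (24, 96)
print(var_param_count(3, 3, intercept=True))  # (10, 30)
```

 The count grows with the square of the number of variables, which is why even small systems become heavily parameterized.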
 A VAR is in a sense a systems regression model i.e. there is more than one dependent variable.
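 The simplest case is a bivariate VAR. With one lag of each variable it is
     y1t = b10 + b11 y1t-1 + a11 y2t-1 + u1t
     y2t = b20 + b21 y2t-1 + a21 y1t-1 + u2t
where uit is an iid disturbance term with E(uit)=0, i=1,2; E(u1t u2t)=0.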
Vector Autoregressive Models: Notation and Concepts
 This model can be extended to the case where there are k lags of each variable in each equation:
     yt = b0 + b1 yt-1 + b2 yt-2 + ... + bk yt-k + ut
 where g is the number of variables, yt, b0 and ut are g x 1 vectors, and b1, ..., bk are g x g matrices of coefficients.
 We can also extend this to the case where the model includes first difference terms and cointegrating relationships (a VECM).
Vector Autoregressive Models Compared with Structural Equations Models
 Advantages of VAR Modelling
- Do not need to specify which variables are endogenous or exogenous - all are endogenous
- Allows the value of a variable to depend on more than just its own lags or combinations of white noise terms, so more general than ARMA modelling
- Provided that there are no contemporaneous terms on the right hand side of the equations, can simply use OLS separately on each equation
- Forecasts are often better than “traditional structural” models.
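 The OLS point in the list above can be sketched in a few lines. This is a minimal illustration using NumPy, not a reference implementation: the helper names simulate_var1 and fit_var_ols and the coefficient values are assumptions made for the example, and the data are simulated.

```python
import numpy as np


def simulate_var1(A, c, n_obs, seed=0):
    """Simulate y_t = c + A y_{t-1} + u_t with standard normal u_t."""
    rng = np.random.default_rng(seed)
    g = len(c)
    y = np.zeros((n_obs, g))
    for t in range(1, n_obs):
        y[t] = c + A @ y[t - 1] + rng.standard_normal(g)
    return y


def fit_var_ols(y, n_lags):
    """Estimate a VAR(n_lags) with intercept by OLS, one equation per column.

    Returns (intercepts, list of lag coefficient matrices, residuals).
    """
    n_obs, g = y.shape
    # Regressors: a constant, then y_{t-1}, ..., y_{t-k} side by side.
    lagged = [y[n_lags - j:n_obs - j] for j in range(1, n_lags + 1)]
    X = np.hstack([np.ones((n_obs - n_lags, 1))] + lagged)
    Y = y[n_lags:]
    # Every equation shares the same regressors, so one least-squares call
    # gives the same answer as running OLS separately on each equation.
    coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)
    intercepts = coefs[0]
    lag_matrices = [coefs[1 + j * g:1 + (j + 1) * g].T for j in range(n_lags)]
    residuals = Y - X @ coefs
    return intercepts, lag_matrices, residuals


# Illustrative (assumed) parameter values.
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.3]])
c_true = np.array([1.0, -0.5])

y = simulate_var1(A_true, c_true, n_obs=5000)
c_hat, A_hat, resid = fit_var_ols(y, n_lags=1)
print(A_hat[0])   # should be close to A_true
print(c_hat)      # should be close to c_true
```

 Because only lagged values appear on the right-hand side, no system estimator is needed for this unrestricted case.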
Problems with VARs
- VARs are a-theoretical (as are ARMA models)
- How do you decide the appropriate lag length?
- So many parameters! If we have g equations for g variables and k lags of each of the variables in each equation, we have to estimate (g + kg^2) parameters, e.g. g=3, k=3 gives 3 + 3 x 9 = 30 parameters
- Do we need to ensure all components of the VAR are stationary?
- How do we interpret the coefficients?
Impulse Response Functions
 VAR models are often difficult to interpret: one solution is to construct the impulse responses and variance decompositions.
 Impulse responses trace out the responsiveness of the dependent variables in the VAR to shocks to the error term. A unit shock is applied to each variable and its effects are noted.
 Consider for example a simple bivariate VAR(1):
     y1t = b10 + b11 y1t-1 + a11 y2t-1 + u1t
     y2t = b20 + b21 y2t-1 + a21 y1t-1 + u2t
 A change in u1t will immediately change y1. It will change y2 and also y1 during the next period.
 We can examine how long, and to what degree, a shock to a given equation affects all of the variables in the system.
 A shock to the i-th variable not only directly affects the i-th variable but is also transmitted to all of the other endogenous variables through the dynamic (lag) structure of the VAR. An impulse response function traces the effect of a one standard deviation shock to one of the innovations on current and future values of the endogenous variables.
 If the innovations εt are contemporaneously uncorrelated, interpretation of the impulse response is straightforward. The i-th innovation εi,t is simply a shock to the i-th endogenous variable yi,t.
 For stationary VARs, the impulse responses should die out to zero and the accumulated responses should asymptote to some (non-zero) constant.
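 For a VAR(1) without intercept, y_t = A y_{t-1} + u_t, the response s periods after a unit shock is simply the matrix A raised to the power s. The sketch below illustrates the points above under that assumption; the function name impulse_responses and the coefficient values are chosen for the example only.

```python
import numpy as np


def impulse_responses(A, horizon):
    """Responses of a VAR(1) y_t = A y_{t-1} + u_t to unit shocks.

    Entry [s, i, j] is the effect on variable i, s periods after a
    unit shock to the innovation of variable j (the matrix A**s).
    """
    g = A.shape[0]
    responses = np.empty((horizon + 1, g, g))
    responses[0] = np.eye(g)          # on impact a shock moves only its own variable
    for s in range(1, horizon + 1):
        responses[s] = A @ responses[s - 1]
    return responses


A = np.array([[0.5, 0.1],
              [0.2, 0.3]])            # illustrative coefficients (assumed)

# Stationarity check: all eigenvalues of A lie inside the unit circle.
print(np.abs(np.linalg.eigvals(A)).max() < 1)

irf = impulse_responses(A, horizon=40)
print(irf[1])                         # equals A: the shock reaches the other variable one period later
print(np.abs(irf[40]).max())          # close to zero: the responses die out
print(irf.sum(axis=0))                # accumulated responses
print(np.linalg.inv(np.eye(2) - A))   # their limit, (I - A)^(-1)
```

 With contemporaneously correlated innovations the unit shocks used here would first have to be orthogonalised.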
Variance Decomposition
 Variance decompositions offer a slightly different method of examining VAR dynamics. They give the proportion of the movements in the dependent variables that are due to their “own” shocks, versus shocks to the other variables.
 This is done by determining how much of the s-step ahead forecast error variance for each variable is explained by innovations to each explanatory variable (s = 1,2,…).
 The variance decomposition gives information about the relative importance of each shock to the variables in the VAR.
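 As a rough sketch of the calculation, consider a VAR(1) y_t = A y_{t-1} + u_t whose innovations are assumed to be uncorrelated with each other. The s-step forecast error of variable i is then a sum of past shocks weighted by the response matrices, and the share due to shock j is its part of that variance. The function name variance_decomposition and the numbers are assumptions for this example.

```python
import numpy as np


def variance_decomposition(A, shock_sd, steps):
    """s-step forecast error variance shares for a VAR(1) y_t = A y_{t-1} + u_t.

    Assumes uncorrelated innovations with standard deviations shock_sd.
    Entry [i, j] is the share of variable i's forecast error variance
    that is due to innovations in variable j.
    """
    g = A.shape[0]
    contributions = np.zeros((g, g))
    psi = np.eye(g)                               # response matrix at horizon 0
    for _ in range(steps):
        contributions += (psi * shock_sd) ** 2    # column j scaled by the sd of shock j
        psi = A @ psi                             # response matrix at the next horizon
    return contributions / contributions.sum(axis=1, keepdims=True)


A = np.array([[0.5, 0.1],
              [0.2, 0.3]])                        # illustrative coefficients (assumed)
sd = np.array([1.0, 1.0])

print(variance_decomposition(A, sd, steps=1))     # identity: only "own" shocks matter one step ahead
print(variance_decomposition(A, sd, steps=10))    # other shocks gain weight at longer horizons
```

 Each row sums to one, so the entries can be read directly as proportions.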
Impulse Responses and Variance Decompositions: The Ordering of the Variables
 When calculating impulse responses and variance decompositions, the ordering of the variables is important.
 The main reason for this is that above, we assumed that the VAR error terms were statistically independent of one another.
 This is generally not true, however. The error terms will typically be correlated to some degree.
 Therefore, the notion of examining the effect of the innovations separately has little meaning, since they have a common component.
 What is done is to “orthogonalise” the innovations.
 In the bivariate VAR, this problem would be approached by attributing all of the effect of the common component to the first of the two variables in the VAR.
 In the general case where there are more variables, the situation is more complex but the interpretation is the same.
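 The text does not name a method of orthogonalisation; a common choice, assumed here, is the Cholesky factorisation of the innovation covariance matrix. The sketch below shows how the ordering decides which variable receives the common component. The function name orthogonalised_impact and the covariance values are illustrative assumptions.

```python
import numpy as np


def orthogonalised_impact(sigma, order):
    """Impact responses to orthogonalised shocks under a given ordering.

    sigma is the innovation covariance matrix; order lists the variable
    indices from first to last. The result is in the original variable
    order: column j is the impact of the shock assigned to variable j.
    """
    order = np.asarray(order)
    permuted = sigma[np.ix_(order, order)]       # covariance in the chosen ordering
    lower = np.linalg.cholesky(permuted)         # permuted = lower @ lower.T
    inverse = np.argsort(order)                  # undo the permutation
    return lower[np.ix_(inverse, inverse)]


sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])                   # correlated innovations (assumed values)

first = orthogonalised_impact(sigma, order=[0, 1])
second = orthogonalised_impact(sigma, order=[1, 0])
print(first)    # the shock to variable 0 moves both variables on impact
print(second)   # now the shock to variable 1 moves both variables on impact
```

 Both matrices reproduce the same covariance matrix, so the data cannot choose between the orderings; the choice has to come from outside the model.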
Granger Causality Tests
 We can test Granger causality by running a VAR on the system of equations and testing for zero restrictions on the VAR coefficients. The Granger (1969) approach to the question of whether x causes y is to see how much of the current y can be explained by past values of y and then to see whether adding lagged values of x can improve the explanation. y is said to be Granger-caused by x if x helps in the prediction of y, or equivalently if the coefficients on the lagged x’s are statistically significant. Note that two-way causation is frequently the case: x Granger-causes y and y Granger-causes x.

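With p lags of each variable, the unrestricted equation for y is
     yt = α0 + α1 yt-1 + ... + αp yt-p + β1 xt-1 + ... + βp xt-p + ut
and the null hypothesis that x does not Granger-cause y is β1 = β2 = ... = βp = 0. The restricted model is therefore
     yt = α0 + α1 yt-1 + ... + αp yt-p + et

 The test statistic is the standard Wald F-statistic
     F = [(ESSR - ESSU) / p] / [ESSU / (T - 2p - 1)]
 where T is the number of observations used in the unrestricted model, ESSU is the error sum of squares of the unrestricted model, and ESSR is the error sum of squares of the restricted model.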
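 A minimal sketch of the test on simulated data follows. It assumes the usual form of the statistic, F = [(ESSR - ESSU) / p] / [ESSU / (T - 2p - 1)] with p lags of each variable, and the function name granger_f_stat and the data-generating values are chosen for the example. No p-value is computed; the statistic would be compared with an F(p, T - 2p - 1) critical value.

```python
import numpy as np


def granger_f_stat(y, x, n_lags):
    """F-statistic for H0: lagged x does not help to predict y."""
    n_obs = len(y)
    target = y[n_lags:]
    const = np.ones((n_obs - n_lags, 1))
    y_lags = np.column_stack([y[n_lags - j:n_obs - j] for j in range(1, n_lags + 1)])
    x_lags = np.column_stack([x[n_lags - j:n_obs - j] for j in range(1, n_lags + 1)])

    def ess(regressors):
        # Error sum of squares from an OLS fit of target on regressors.
        beta, *_ = np.linalg.lstsq(regressors, target, rcond=None)
        resid = target - regressors @ beta
        return float(resid @ resid)

    ess_r = ess(np.hstack([const, y_lags]))            # restricted: own lags only
    ess_u = ess(np.hstack([const, y_lags, x_lags]))    # unrestricted: adds lagged x
    T = n_obs - n_lags                                 # observations actually used
    return ((ess_r - ess_u) / n_lags) / (ess_u / (T - 2 * n_lags - 1))


# Simulated data in which x drives y but y does not drive x.
rng = np.random.default_rng(1)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.3 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

print(granger_f_stat(y, x, n_lags=2))   # expected to be large: x Granger-causes y
print(granger_f_stat(x, y, n_lags=2))   # expected to be small: y does not Granger-cause x
```

 Running the test in both directions, as here, is how the two-way causation mentioned above would show up in practice.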

Presented by Dr. Babar Zaheer Butt to the students of MS/PhD at Iqra University Islamabad