TSP Version 4.4
Documentation for New and Enhanced Features

Together with the December 1996 printing of the Reference Manual and User's Guide for TSP version 4.3, these notes are a complete set of documentation for TSP Version 4.4. New editions of the complete manuals will be issued in late Fall 1997. The new and revised features of TSP 4.4 are the following:

1. A companion program, TSP through the Looking Glass, provides an easy-to-use Windows interface for the development of TSP programs. Looking Glass is documented separately (see tlg.txt on the tlg diskette and the TLG help system, which you may access by selecting Contents... under the Help menu).

2. ML PROC. A powerful enhancement to the maximum likelihood procedure (ML) allows you to evaluate the log likelihood function in a TSP procedure (PROC) rather than as a single equation. This enables maximum likelihood estimation of several new classes of models: recursive time series (e.g., GARCH, ARIMA), hyperparameter estimation in state space/Kalman filter models, simulation models such as multivariate normal integration, and even simple things like concentrated log likelihood functions. It is also easy to use MLPROC to impose inequality constraints on the input parameters (or functions of them), such as on the variance function (h(t)) in GARCH models. See the TSP examples library for many examples of using MLPROC.

3. Regression output has been improved by adding more default tests and P-values. The new diagnostics are a general purpose heteroskedasticity test LMHET, the Shapiro-Wilk normality test, Ramsey's RESET test, and the Kullback-Leibler ("deviance") R-squared for Probit and Logit.

4. The PANEL procedure has been enhanced to handle panels with missing values, lags, and leads. A new PANEL option has been added to FREQ in order to facilitate future language development in the handling of PANEL data. The Durbin-Watson and LMHET diagnostics have been added to PANEL.

5. The instrumental variable procedures (2SLS, 3SLS, GMM) have been revised to speed computation where there are large numbers of instruments and observations; as a byproduct, 2SLS memory usage has been reduced when there are few endogenous variables (i.e., when instruments and right hand side variables coincide for the most part). In addition, a global FAST option has been added to allow even faster computation of regression estimates when great accuracy is not required, or the researcher knows that the matrix of instruments and the matrix of right hand side variables or derivatives are not ill-conditioned. This latter improvement applies to all of the estimation procedures, linear (especially OLSQ) or nonlinear.

6. Long variable names (up to 64 characters) are now allowed. This has several advantages: compatibility with databases such as the OECD's and Citibase, and easier construction of variable names in DOT, FORM, and DIFFER.

7. MMAKE: a new VERT option has been added to stack matrices or series vertically. Without the option, they are stacked horizontally.

8. FORM (VARPREF=<prefix>) can be used to give the coefficients in the created equation names formed by concatenating <prefix> with the variable name.

9. Minor corrections and enhancements: see the extended lists at the end of this document. These lists are divided into 5 sections:

9.1 General changes (in 4.4 only).
9.2 General changes (also in some copies of 4.3).
9.3 Command-specific improvements (in 4.4 only).
9.4 Command-specific improvements (also in some copies of 4.3).
9.5 Non-upward compatibilities.

2. ML PROC

Sometimes it is extremely difficult or impossible to write down the log likelihood in a single FRML (even with use of EQSUB). See the list below for some examples. For this form of the ML command, write a PROC to evaluate the log likelihood and store it in @LOGL. Give the name of this PROC as the first argument (after any options) of the ML command, and follow it with a list of PARAMs to be estimated. Write the PROC so that it starts by checking any constraints on the PARAMs. If any constraint is violated, set @LOGL to @MISS before exiting from the PROC. OPTIONS DOUBLE; is advised if you want to use double precision to form intermediate results such as residuals.

The main disadvantage of using PROC instead of FRML is that analytic derivatives are not available. However, numeric derivatives (default HITER=F and GRAD=C2) will often be quite adequate. A slight disadvantage is that you have to list the PARAMs to be estimated explicitly in the command line. This is required because a single FRML which collects all the PARAMs in one place may not exist.

Good Applications for the PROC method:

1. Time series models like ARMA and GARCH, where the equations are recursive (depend on residuals or variance from the previous time period(s)). Models which can be evaluated by the KALMAN command also fit into this category (ML thus allows estimation of the hyperparameters).

2. Multi-equation models like FIML. These involve Jacobians, matrix inverses, and determinants, which would have to be written into the log likelihood equation explicitly (very difficult for more than about 4 equations unless the Jacobian is sparse).

3. Models which require several diverse commands to evaluate, such as multivariate normal integrals via simulation, or other functions not built into TSP. Another example in this class is a concentrated log likelihood function.

4. Models with complicated constraints. A good example would be ARCH models, where the conditional variance must be positive for every observation.
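
The constraint-checking pattern in item 4 can be sketched outside TSP. The following Python fragment is only an illustration, not TSP code: the function name arch1_loglik, the parameter names a0 and a1, and the residual values are hypothetical, and NaN stands in for the role that @MISS plays in a TSP PROC. It evaluates a Gaussian ARCH(1) log likelihood and gives up as soon as the conditional variance is not positive for some observation.

```python
import math


def arch1_loglik(resid, a0, a1):
    """Gaussian ARCH(1) log likelihood with h(t) = a0 + a1*e(t-1)^2.

    Returns NaN (standing in for @MISS) if h(t) <= 0 for any observation.
    The first residual is used only as a lag.
    """
    logl = 0.0
    for t in range(1, len(resid)):
        h = a0 + a1 * resid[t - 1] ** 2
        if h <= 0.0:
            # Constraint violated: signal an invalid evaluation to the optimizer.
            return float("nan")
        logl += -0.5 * (math.log(2.0 * math.pi) + math.log(h) + resid[t] ** 2 / h)
    return logl


e = [0.5, -1.2, 0.3, 2.0, -0.7]                 # hypothetical residuals
ok = arch1_loglik(e, a0=0.5, a1=0.3)            # variance positive everywhere
bad = arch1_loglik(e, a0=-0.1, a1=0.01)         # first h(t) is negative
```

Returning a missing value rather than raising an error lets an iterative optimizer back off to a smaller step, which is the behaviour the text describes for the PROC form of ML.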

Bad Applications for either method:

1. Existing linear models in TSP (PROBIT, TOBIT, LOGIT, SAMPSEL). The regular TSP commands are more efficient, more resistant to numerical problems, often have better starting values, and provide model-specific statistics. See Timing example below.

To give an idea of how much this convenience costs in terms of CPU time, here is a timed example run on the VAX 11/780 of a Probit on 385 observations, 8 variables.

Time in
CPU seconds   Procedure
    5.24      PROBIT command
   65.65      ML(HITER=B,HCOV=N)
   75.97      ML(HITER=N,HCOV=N)

The moral is that ML should not be used when you have a Fortran-coded alternative estimation program, but could be useful if you don't want to spend your time developing such a program. Also, in this case, the method of scoring was somewhat faster than Newton's method, although the latter is more powerful (it takes fewer iterations).

Tips:

1. Write the equation carefully to avoid things like Log(x<=0) or Exp(x>88). These are fatal errors if they happen in the first function evaluation (using the starting values). They are not fatal during iterations (the program automatically uses a smaller stepsize), but they can be inefficient. Often these problems can be avoided by reparametrizing the likelihood function. The standard example of this is estimating SIGMA (or SIGMA-inverse) instead of SIGMA-squared.

If you are getting numerical errors and you can't rewrite the likelihood function, try using SELECT to remove the problem observations. After you get convergence, use the converged values as starting values and reestimate using the full sample.

2. Choose starting values carefully (see previous).

3. Use EQSUB for less work rewriting equations and more efficient code.

4. The "Working space=" message is an indication of the length of the derivative code.

5. If the second derivative matrix is singular, you may have sign errors in the log likelihood function (the inversion routine assumes the second derivative matrix is negative definite).

6. If you are using derivatives, make sure the functions you use are differentiable. Logical operations are not differentiable everywhere, although they are differentiable at all but a finite number of points. TSP will do the best it can with them, but if you end up on a kink (corner), it may stall.

Examples:

PROC method. Here is a simple concentrated log likelihood function, where we estimate the mean of a time trend, and concentrate out the variance parameter to reduce the nonlinearity of the function.

OPTIONS DOUBLE; ? make sure that residuals are stored in double precision
SMPL 1,9;
TREND T; ? yields same results as MSD T; or OLSQ T C;
PARAM MT,2;
ML NRMLC MT; ? PROC form of the ML command
PROC NRMLC;
E = T - MT; ? residual
MAT SIG2 = (E'E)/@NOB; ? sigma-squared (variance)
SET PI = 4*ATAN(1); ? tan(pi/4) = 1
SET @LOGL = -(@NOB/2)*( LOG(SIG2) + 1 + LOG(2*PI) );
ENDPROC;
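
As a cross-check of the arithmetic in the PROC above, here is a small Python sketch (an illustration only, not TSP code). It evaluates the same concentrated log likelihood for the trend 1,...,9 on a coarse grid of values for MT; the function name concentrated_logl and the grid are assumptions made for this example. Because the concentrated function is decreasing in the residual variance, the maximum falls at the sample mean of T.

```python
import math


def concentrated_logl(mt, t_values):
    """Concentrated normal log likelihood: the variance is replaced by e'e/n."""
    n = len(t_values)
    sig2 = sum((t - mt) ** 2 for t in t_values) / n
    return -(n / 2.0) * (math.log(sig2) + 1.0 + math.log(2.0 * math.pi))


trend = list(range(1, 10))                     # SMPL 1,9; TREND T;
grid = [2.0 + 0.5 * k for k in range(13)]      # candidate MT values 2.0, 2.5, ..., 8.0
best = max(grid, key=lambda mt: concentrated_logl(mt, trend))
```

A grid search is used here only to keep the sketch short; it is not how ML iterates.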
 
3. REGRESSION OUTPUT (REGOPT)

In the past, we have made many diagnostics available, but very few of them were used by default. Now, with PCs running at ever-higher speeds, we feel the extra cost of the new diagnostics is minimal, while the benefits are many. We have tried to keep the default number of lines of output about the same, since producing a huge list of (possibly redundant) diagnostics would dilute their value.

New diagnostics:

SWILK - Shapiro-Wilk normality test (including the P-value) has been found to be superior to the Jarque-Bera normality test in small samples (the Jarque-Bera test is an asymptotic LM test). The SWILK test is based on the sorted residuals; for details see Shapiro and Wilk (1965). TSP uses Applied Statistics algorithm R94, which can be viewed at the Applied Statistics website at Carnegie-Mellon University.

RESET - Ramsey's RESET test is a Lagrange multiplier test of functional form; the residuals from the linear model are regressed on the independent variables and powers of the fitted dependent variable. The number of powers is controlled by the RESETORD=k option in REGOPT. The RESET test with RESETORD=2 (i.e., the alternative is quadratic terms in the regression) is printed by default as part of the regression output. The test statistic may also be significant if there is an outlier that is fit well by a quadratic function of x.

LMHET - a simple LM heteroskedasticity test: a regression of the squared residuals on the squared fitted values and a constant term. There is no special name or reference for this test, but it can be considered a special case of the Breusch-Pagan heteroskedasticity test (also provided by TSP, see BPHET in the REGOPT documentation). This test is provided by default for all regression procedures that do not generally have right hand side endogenous variables, that is, OLSQ, SUR, VAR, and PANEL. It is not provided for AR1, 2SLS, 3SLS, GMM, or FIML since it is not valid in the case of an endogenous right hand side variable.

KLRSQ - the Kullback-Leibler R-squared, sometimes called the deviance R-squared, pseudo-R-squared, or McFadden's R-squared, is a generalization of the R-squared for a large class of nonlinear econometric models (such as Probit, Logit, GLS, and models for count data). See Cameron and Windmeijer (1997) for a detailed description. TSP now reports this R-squared statistic for Probit and Multinomial Logit models. For this type of model, the statistic is equal to

1 - log L(fitted model)/log L(model with intercept only)

Thus the statistic will be zero if the predictive power of the model is no better than simply using the average probability for all observations (conditioning on the X variables does not help in predicting the outcome). It will approach unity as the likelihood for the fitted model becomes much larger than that for a model with an intercept only, although it will never reach one unless one of the X's is a perfect predictor.
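
The formula can be illustrated with a few lines of Python (a sketch only, not TSP code). The function names and the toy data are hypothetical. The sketch relies on the fact that the intercept-only binary model fits every observation with the sample average of the outcome.

```python
import math


def bernoulli_logl(y, p):
    """Log likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def kl_rsq(y, p_fitted):
    """1 - log L(fitted model) / log L(model with intercept only)."""
    p_bar = sum(y) / len(y)                     # intercept-only fitted probability
    logl_fitted = bernoulli_logl(y, p_fitted)
    logl_null = bernoulli_logl(y, [p_bar] * len(y))
    return 1.0 - logl_fitted / logl_null


y = [1, 0, 1, 1, 0, 0, 1, 0]                                    # hypothetical outcomes
r_null = kl_rsq(y, [0.5] * 8)                                   # no better than the average
r_good = kl_rsq(y, [0.9, 0.1, 0.8, 0.9, 0.2, 0.1, 0.7, 0.3])    # informative fit
```

With these numbers r_null is zero and r_good lies strictly between zero and one, matching the two limiting cases described above.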

Default diagnostics added to the regression procedures:

These diagnostics are also stored in TSP memory under the names @test (for the statistic) and %test (for its P-value) for your further use.

@LMHET, %LMHET. This is the default heteroskedasticity diagnostic.

@RESET2, %RESET2. This is the default functional form diagnostic (OLSQ only).

@JB, %JB. (Jarque-Bera normality test). This is the default test for nonnormality (OLSQ only).

Default P-values added to the regression output:

TSP provides almost all P-values by default, except for the exact P-value of the Durbin-Watson (this is a fairly slow calculation, especially for large numbers of observations; it can be requested optionally, but is not the default). Those added in version 4.4 are the following:

DW - Durbin-Watson statistic. There are different types of P-value for this (see the Changes to REGOPT below). The default depends on the current frequency of the data:

For FREQ N;, we assume the user probably has a cross section dataset, so we compute the quickest P-value (the option APPROX). This is displayed as [<pvalue]. It is a conservative upper bound, i.e. it is always larger than the precise upper bound. Under the null of no serial correlation, the probability of drawing the observed serial correlation of the residuals is less than or equal to the P-value shown.

For other frequencies (FREQ A,Q,M,W, or numeric), the data are clearly time series, so TSP computes the BOUNDS version of the P-values (this is the version familiar from elementary econometrics textbooks and courses). The precise upper and lower bounds for the probability of seeing the observed serial correlation under the null are displayed as [plow,phigh]. This version is also computed when the PANEL frequency option is on or when the PANEL procedure is used (in this case the Durbin-Watson statistic is computed only within the individual units, and the bounds given by Bhargava, Franzini, and Narendranathan (1982) are used).

The user can override these defaults by using the DWPVAL option in REGOPT (see below). In the case of instrumental variable estimation, the number of instruments (rather than the number of right hand side variables) is used when computing the P-value bounds.

DH - Durbin's h (when there is a single lagged dependent variable).

DHALT - Durbin's h alternative (when there is at least one lagged dependent variable).

FST - F statistic for the null of zero slopes.

T - Student's t-statistics for regression coefficients are given for non-OLS commands as well, but they are usually asymptotically normally distributed rather than Student's t distributed in these cases. Because of this, TSP uses the normal distribution rather than the t distribution to compute the P-values for LSQ, INST, etc.

FBEX - F (block exogeneity) - VAR is a standard Granger causality test where the null is that lags of the other dependent variables do not enter the equation for the current variable in the presence of its own lags. This test has the standard F-distribution under the null and the corresponding p-value is reported.

FOVERID - F (overidentifying restrictions) - LIML is an asymptotic F-test that all the "extra" instruments satisfy the constraint that they are orthogonal to the disturbance. The p-value under the null that the overidentifying restrictions hold is printed.

Changes to the REGOPT command:

PVCALC and PVPRINT are now "on" by default. Previously the defaults were NOPVCALC and NOPVPRINT.

RESETORD=k. Controls the polynomial order of Ramsey's RESET test. The default value is 2. See above under "New diagnostics" for more information.

DWPVAL=APPROX or BOUNDS or EXACT option specifies the method used to calculate the Durbin-Watson P-value. For this option's default value, see above under "P-values added". Also see above for a description of how the P-value is displayed. The exact P-value is the only one displayed in the usual manner, for example, [.006].

APPROX - a conservative upper bound on the P-value is computed by fitting a nonlinear regression in @NOB and @NCOEF to the dL values in the standard table. It is the quickest of the three options to compute.

BOUNDS - precise upper and lower bounds for the P-value are computed by considering the minimal and maximal possible sets of eigenvalues for the exact test. These are used for computing the dL and dU values in the standard tables; the general formula is given in Bhargava et al (1982). The calculation may be somewhat slow for large numbers of observations.

EXACT - the exact P-value, calculated from the positive eigenvalues of the (T-1) x (T-1) symmetric matrix

Q = DD' - (DX)(X'X)"(DX)'        (" denotes matrix inverse, as in the MAT command)

where D is a (T-1) x T matrix whose first row is [-1 1 0 0 ... 0], so that DX are the first differences of X. The positive eigenvalues of Q are the same as those for MA, where A = D'D and M is the annihilation matrix for X (M = I - X(X'X)"X'). MA is the matrix mentioned in the classic Durbin-Watson (1951) reference, but Q is much easier to compute than MA. Once the eigenvalues are obtained, the P-value of a quadratic form in normal variables is obtained using Applied Statistics algorithm 153 R52 (from StatLib). This calculation can be quite slow, and memory intensive, for large numbers of observations.
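
The matrices in this definition can be written out in a short Python sketch (an illustration only, not TSP code and not the algorithm TSP uses). The data, the single-regressor X, and the helper functions are assumptions made for the example. The sketch does not compute eigenvalues; it only checks two weaker consequences of the statements above: the traces of Q and MA agree (their nonzero eigenvalues coincide), and the Durbin-Watson statistic equals e'Ae/e'e for the residuals e = My.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


T = 6
x = [[1.0], [2.0], [4.0], [3.0], [5.0], [7.0]]   # T x 1 matrix X (hypothetical data)
y = [1.1, 1.9, 4.3, 2.8, 5.2, 6.6]               # hypothetical dependent variable

# D is (T-1) x T with rows [-1 1 0 ... 0], [0 -1 1 0 ... 0], and so on.
D = [[(-1.0 if j == i else 1.0 if j == i + 1 else 0.0) for j in range(T)]
     for i in range(T - 1)]

xtx_inv = 1.0 / sum(r[0] ** 2 for r in x)        # inverse of X'X, a scalar for one regressor
DX = matmul(D, x)                                # first differences of X
DDt = matmul(D, transpose(D))
Q = [[DDt[i][j] - DX[i][0] * xtx_inv * DX[j][0] for j in range(T - 1)]
     for i in range(T - 1)]

# M = I - X (X'X)^-1 X' and A = D'D
M = [[(1.0 if i == j else 0.0) - x[i][0] * xtx_inv * x[j][0] for j in range(T)]
     for i in range(T)]
A = matmul(transpose(D), D)
MA = matmul(M, A)

# Residuals e = My, and the Durbin-Watson statistic computed two ways.
e = [sum(M[i][j] * y[j] for j in range(T)) for i in range(T)]
ee = sum(v * v for v in e)
dw_diff = sum((e[t] - e[t - 1]) ** 2 for t in range(1, T)) / ee
Ae = [sum(A[i][j] * e[j] for j in range(T)) for i in range(T)]
dw_quad = sum(e[i] * Ae[i] for i in range(T)) / ee
```

Note that Q is (T-1) x (T-1) and symmetric, while MA is T x T and generally not symmetric, which is one reason Q is the more convenient matrix to work with.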

Diagnostics removed

DW - removed when DH or DHALT are calculated, that is when there are lagged dependent variables. The DW is biased in this case, so printing a P-value would be misleading. Note that a DW is still calculated in other commands like 2SLS and VAR; it will be biased here also if lagged dependent variables have been included in the variable list.

FST - removed from AR1, BJEST, and 2SLS, because it was not calculated appropriately for those estimators.

4. PANEL

PANEL is now able to handle series with missing values (like the other estimation procedures, it will drop observations containing missing values in one or more of the variables, including observations that use nonexistent lags or leads). A new PANEL option has been added to FREQ so that series can be explicitly typed as panel data. The panel information is shared with PANEL, PRINT, and AR1(TSCS) at the present time, so you no longer have to check for missing values by hand, or set gapped SMPLs for AR1(TSCS).

For example, suppose you have data for 4 years on 10 cross sectional units, with an ID variable called PID:

Obs. PID Year 
1 101 86
2 101 87
3 101 88
4 101 89
5 102 86
6 102 87
.
.
40 110 89

The following statements define the frequency and sample:

FREQ (PANEL,T=4) ;
SMPL 1 40 ;

This specifies the data will be a balanced panel with 4 time periods of data per cross section unit. Alternatively, you could specify the panel structure using the ID variable instead of the T option:

FREQ (PANEL,ID=PID) ;

If the panel is unbalanced (different numbers of time periods for each unit), the second form of the FREQ statement must be used. The ID variable does not have to exist when the FREQ command is given, so the FREQ command can be placed at the top of a run, and ID can be READ in later in the TSP run.

To describe the above panel data more fully, you could supply the time series frequency (Annual) and starting date (1986, the first year in the table above). Although this will not affect estimation, it will affect labelling:

FREQ(PANEL,T=4,START=86) A;

You can even specify redundant information, as a check on the integrity of the ID series. TSP will check to make sure the number of time periods and number of individuals matches the ID series:

FREQ(PANEL,ID=PID,T=4,N=10,START=86) A;

In TSP 4.4, the advantage of specifying the panel structure in the FREQ(PANEL,...) command is that you do not have to specify these options repeatedly in each PANEL command or (via the old SMPL gaps) in AR1(TSCS) commands. In addition, all Durbin-Watson statistics will skip differenced residuals across different individuals, and lags (or leads) will not operate across the different units (in GENR, SELECT, etc.). In future versions of TSP, we plan more enhancements which use this FREQ(PANEL,...) information.
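
The rule that lags and leads do not operate across units can be pictured with a small Python sketch (an illustration of the idea only, not TSP's implementation). The function name panel_lag is hypothetical, and the sketch assumes the observations are sorted by unit and then by time with no gaps within a unit. None stands in for a missing value.

```python
def panel_lag(values, ids, k=1):
    """Lag by k periods within each unit (negative k gives a lead).

    Returns None wherever the lagged observation would belong to a
    different unit or fall outside the sample.
    """
    out = []
    for i in range(len(values)):
        j = i - k
        if 0 <= j < len(values) and ids[j] == ids[i]:
            out.append(values[j])
        else:
            out.append(None)
    return out


pid = [101, 101, 101, 101, 102, 102, 102, 102]
x = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0]
lagged = panel_lag(x, pid)         # first period of each unit has no lag
led = panel_lag(x, pid, k=-1)      # last period of each unit has no lead
```

Without the panel information, the lag at the first observation of unit 102 would wrongly pick up the last observation of unit 101.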

New diagnostics for Panel Data:

LMHET -- this is included by default in the Panel regression output, because heteroskedasticity is common in such data.

DW -- the BOUNDS version of the Durbin-Watson is displayed in the Panel regression output by default. It is computed using the outline in Bhargava, et al (1982), which handles unbalanced data. A Fortran version of Applied Statistics algorithm 256 (StatLib), which is like the algorithm AS 153 (Pan), but allows for multiple equal eigenvalues (Imhof), is used.

7. MMAKE

A VERTICAL option has been added to stack series or matrices vertically. Suppose that you have 3 series named x1, x2, and x3 with 100 observations each and you wish to make a 100 by 3 matrix with these series. You would use the command below to accomplish this task:

MMAKE xmat x1 x2 x3 ;

But if you wish to make a column vector of length 300 with the 3 series stacked one after the other, use the command

MMAKE (VERT) xstack x1 x2 x3 ;

to accomplish this. It will also work with (conformable) matrices. For example, suppose you have matrices z1 (3 by 2), z1t (2 by 3), and z2 (3 by 3). The command

MMAKE z3 z1 z2 ;

makes the 3 by 5 matrix z3 = [z1 z2]. The command

MMAKE (VERT) z4 z1t z2 ;

makes the 5 by 3 matrix z4, with the two rows of z1t stacked on top of the three rows of z2.
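
The dimension bookkeeping for the two forms of stacking can be checked with a short Python sketch (an illustration only, not TSP code). The helper names hstack and vstack and the matrix entries are hypothetical; matrices are represented as lists of rows.

```python
def hstack(*blocks):
    """Place blocks side by side; all must have the same number of rows."""
    rows = len(blocks[0])
    if any(len(b) != rows for b in blocks):
        raise ValueError("blocks must have the same number of rows")
    return [[v for b in blocks for v in b[i]] for i in range(rows)]


def vstack(*blocks):
    """Place blocks one above the other; all must have the same number of columns."""
    cols = len(blocks[0][0])
    if any(len(row) != cols for b in blocks for row in b):
        raise ValueError("blocks must have the same number of columns")
    return [list(row) for b in blocks for row in b]


z1 = [[1, 2], [3, 4], [5, 6]]                 # 3 by 2
z1t = [[1, 3, 5], [2, 4, 6]]                  # 2 by 3
z2 = [[7, 8, 9], [10, 11, 12], [13, 14, 15]]  # 3 by 3

z3 = hstack(z1, z2)     # 3 by 5, like MMAKE z3 z1 z2 ;
z4 = vstack(z1t, z2)    # 5 by 3, like MMAKE (VERT) z4 z1t z2 ;
```

Swapping the arguments, for example trying to stack z1 on top of z2, fails the conformability check, which is what "conformable" means in the text above.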

8. FORM (VARPREF=<prefix>,RESID,CONST): 

This new VARPREF option allows you to attach a coefficient prefix of your choice to the variable names from a regression when forming the names of the coefficients in the equation. The easiest way to see it work is by example:

FORM (VARPREF=B) EQB Y C X1 X2 ;

makes and stores an equation called eqb as though the following FRML command had been issued:

FRML EQB Y = B0 + BX1*X1 + BX2*X2 ;

Note the use of 0 as the suffix for the intercept (the constant term). It is possible to use VARPREF with leads and lags of the right hand side variables; in that case, FORM adds the absolute value of the lag to the name if the variable is lagged, and L followed by the lead if the variable is led:

FORM (VARPREF=F) EQ GDP C CONS CONS(-1) CONS(+1) X ;

yields

FRML EQ GDP = F0 + FCONS*CONS + FCONS1*CONS(-1)
+ FCONSL1*CONS(+1) + FX*X ;
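
The naming rule can be restated as a short Python sketch (an illustration only; it mimics the rule as described here and is not how FORM is implemented). The function names varpref_name and form_frml are hypothetical, and the handling of anything other than C, a plain name, or a name with a nonzero integer lag or lead is an assumption.

```python
import re


def varpref_name(prefix, term):
    """Coefficient name for one right hand side term under VARPREF=prefix."""
    if term.upper() == "C":
        return prefix + "0"                    # intercept gets the suffix 0
    m = re.fullmatch(r"(\w+)\(([+-]?\d+)\)", term)
    if m is None:
        return prefix + term                   # plain variable name
    name, shift = m.group(1), int(m.group(2))
    if shift < 0:
        return f"{prefix}{name}{-shift}"       # lag: absolute value of the lag
    return f"{prefix}{name}L{shift}"           # lead: L followed by the lead


def form_frml(eq, dep, rhs, prefix):
    """Build the text of the FRML that the FORM command would correspond to."""
    parts = []
    for term in rhs:
        coef = varpref_name(prefix, term)
        parts.append(coef if term.upper() == "C" else f"{coef}*{term}")
    return f"FRML {eq} {dep} = " + " + ".join(parts) + " ;"


frml_f = form_frml("EQ", "GDP", ["C", "CONS", "CONS(-1)", "CONS(+1)", "X"], "F")
frml_b = form_frml("EQB", "Y", ["C", "X1", "X2"], "B")
```

The two strings reproduce the FRML statements shown for the VARPREF=F and VARPREF=B examples above, apart from the line break.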

The RESID option (included in some copies of TSP 4.3) allows you to create an unnormalized version of the equation. For example,

OLSQ Y C X ;
FORM (RESID, PREFIX=B) EQ1 ;

yields

FRML EQ1 Y- (B0+B1*X) ;

The CONST option with a list of equations as arguments to the FORM command causes all of the scalar variables in the equations (e.g., B0, B1 in the above example) to be converted to their current values. This is useful for printing out an equation or storing it without having to worry about defining the parameter values. However, it does mean that the equation will only be useful for simulation in the future and not for estimation.

9. OTHER CORRECTIONS & ENHANCEMENTS

9.1 General changes (in 4.4 only)

9.2 General Changes (released in some copies of 4.3)

9.3 Command-specific improvements (in 4.4 only)

9.4 Command-specific improvements (released in some copies of 4.3)

9.5 NON-UPWARD COMPATIBILITIES

Additional References

Bhargava, A., L. Franzini, and W. Narendranathan, "Serial Correlation and the Fixed Effects Model," Review of Economic Studies XLIX (1982): 533-549.

Cameron, A. C., and F. A. G. Windmeijer, "R-Squared Measures for Count Data Regression Models with Applications to Health Care Utilization," Journal of Business & Economic Statistics 14 (1996): 209-220.

Cameron, A. C., and F. A. G. Windmeijer, "An R-Squared Measure of Goodness of Fit for Some Common Nonlinear Regression Models," Journal of Econometrics 77 (1997): 329-342.

Cheung, Yin-Wong, and Lai, Kon S., "Lag Order and Critical Values of the Augmented Dickey-Fuller Test," Journal of Business & Economic Statistics 13 (July 1995): 277-280.

MacKinnon, James G., "Critical Values for Cointegration Tests," Long-Run Economic Relationships: Readings in Cointegration, eds. R.F. Engle and C.W.J. Granger, New York: Oxford University Press, (1991): 267-276.

Royston, Patrick, "Algorithm AS R94," Applied Statistics 44 (1995).

Savin, N.E., and Kenneth J. White, "Testing for Autocorrelation with Missing Observations," Econometrica 46 (1978): 59-67.

Shapiro, S. S., and M. B. Wilk, "An Analysis of Variance Test for Normality (Complete Samples)," Biometrika 52 (1965): 591-611.

Shapiro, S. S., M. B. Wilk, and H. J. Chen, "A Comparative Study of Various Tests of Normality," Journal of the American Statistical Association 63 (1968): 1343-1372.

Statlib, http://lib.stat.cmu.edu/apstat/.

