Nonlinear Options

These options are common to all of TSP's nonlinear estimation and simulation procedures: ARCH, BJEST, FIML, LSQ, PROBIT, TOBIT, LOGIT, SAMPSEL, SIML, SOLVE, ML, etc. See Chapter 10 of the User's Guide for further information.

DROPMISS, EPSMIN=<value>, GRADCHEC, GRADIENT=method, HCOV=method, HESSCHEC, HITER=method, MAXIT=<# of iterations>, MAXSQZ=<# of squeezes>, NHERMITE=<value>, PRINT, SILENT, STEP=<squeezing method>, STEPMEM=<num of iterations>, SQZTOL=<squeezing tolerance>, SYMMETRIC, TERSE, TOL=<parameter convergence tolerance>, TOLG=<gradient/CRIT convergence tolerance>, TOLS=<squeezed parameter convergence tolerance>, VERBOSE

Usage

Include these options among any other special options that you supply in parentheses after the name of the command that invokes the estimation procedure:

FIML (ENDOG=(...), nonlinear options) list of eq names ;

Method

The method used for nonlinear estimation is generally a standard gradient method, explained in more detail in Chapter 10 of the User's Guide. Briefly, at each iteration, a new parameter vector is computed by moving in the direction specified by the gradient of the likelihood (uphill), weighting this gradient by an approximation to the matrix of second derivatives at that point (in order to adjust for the curvature). Convergence is declared when the changes in the parameters are all "small", where "small" is defined by the TOL= option.
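
This loop can be illustrated with a minimal Python sketch (not TSP code; the quadratic objective, the simple halving squeeze, and the fixed tolerances are assumptions for the example):

```python
import numpy as np

def newton_iterate(f, grad, hess, b, tol=1e-3, maxit=20, maxsqz=10):
    """Minimal sketch of the gradient-method loop described above: move
    uphill on the objective f, weighting the gradient by the inverse
    Hessian approximation, squeezing the stepsize when needed."""
    for _ in range(maxit):
        g = grad(b)
        H = hess(b)
        # Newton direction: solve (-H) d = g. Near a maximum H is negative
        # definite, so -H gives an ascent direction.
        d = np.linalg.solve(-H, g)
        lam, f0 = 1.0, f(b)
        for _ in range(maxsqz):            # squeeze the step until f improves
            if f(b + lam * d) > f0:
                break
            lam /= 2.0
        b = b + lam * d
        if np.max(np.abs(lam * d)) < tol:  # converged: parameter changes "small"
            return b
    return b

# Example: maximize f(b) = -(b1-1)^2 - 2*(b2+0.5)^2, maximum at (1, -0.5)
f = lambda b: -(b[0] - 1) ** 2 - 2 * (b[1] + 0.5) ** 2
grad = lambda b: np.array([-2 * (b[0] - 1), -4 * (b[1] + 0.5)])
hess = lambda b: np.array([[-2.0, 0.0], [0.0, -4.0]])
print(newton_iterate(f, grad, hess, np.array([5.0, 5.0])))  # near [1, -0.5]
```

For this quadratic objective the Newton step lands on the maximum in one iteration; for real likelihoods the squeeze loop matters more.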

Options

DROPMISS/NODROPMISS specifies whether observations with missing values in any variables are to be dropped. This can be useful if the equation being estimated varies according to the presence of good data, but use with caution.

EPSMIN=  minimum parameter change for numeric derivatives [default is .0001]. This is also used to control the numeric stepsize when computing HCOV=C and HCOV=U. If you have parameters smaller than .00001 in magnitude, it will be helpful to use an EPSMIN value somewhat smaller than your smallest parameter; otherwise, too large a stepsize is used and those parameters will appear to have zero standard errors.

GRADCHEC/NOGRADCHEC evaluates and compares the analytic and numerical gradient for the current model at the starting values. No actual estimation takes place. Useful for checking derivatives of a new likelihood function. The numeric gradient is evaluated in a time-consuming but accurate way. See the GRAD=C4 option.

GRADIENT= ANALYTIC or C2 or C4 or FORWARD specifies the method of calculating numeric first derivatives.

GRAD=A is the default when analytic first derivatives are available (as is usually the case).

GRAD=FORWARD calculates the numeric derivatives for a given parameter B as

D = (F(B+EPS) - F(B))/EPS

(1 function evaluation per parameter)

GRAD=C2 (CENTRAL2) uses

D = (F(B+EPS) - F(B-EPS))/(2*EPS)

(2 function evaluations per parameter)

GRAD=C4 uses

D = (-F(B+2*EPS) + 8*F(B+EPS) - 8*F(B-EPS) + F(B-2*EPS))/(12*EPS)

(4 function evaluations per parameter)

In all cases, EPS = MAX(ABS(.001*B), EPSMIN).
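
These formulas and the EPS rule can be checked with a small Python sketch (illustrative, not TSP code; `eps_for` is a hypothetical helper, and `np.exp` is used as the test function because its derivative is known exactly):

```python
import numpy as np

EPSMIN = 1e-4  # TSP's default minimum stepsize

def eps_for(b):
    # EPS = MAX(ABS(.001*B), EPSMIN), as above
    return max(abs(1e-3 * b), EPSMIN)

def grad_forward(f, b, eps):
    return (f(b + eps) - f(b)) / eps              # 1 extra evaluation

def grad_c2(f, b, eps):
    return (f(b + eps) - f(b - eps)) / (2 * eps)  # 2 evaluations

def grad_c4(f, b, eps):
    return (-f(b + 2*eps) + 8*f(b + eps)
            - 8*f(b - eps) + f(b - 2*eps)) / (12 * eps)  # 4 evaluations

# The truncation error shrinks with the order: O(eps), O(eps^2), O(eps^4).
f = np.exp          # true derivative at b is exp(b)
b = 1.0
eps = eps_for(b)
true = np.exp(b)
for g in (grad_forward, grad_c2, grad_c4):
    print(g.__name__, abs(g(f, b, eps) - true))
```

Running this shows each successive formula cutting the error by several orders of magnitude, which is why GRADCHEC uses the accurate C4 formula.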

HCOV=B or N or G or F or D or W or R or P or Q or U or C or BNW, etc., specifies the method for calculating the asymptotic covariance matrix of the parameter estimates (and standard errors). The default is usually N or B, depending on the procedure. Some procedures may not have N (and thus W) available. A label is printed below each table of standard errors and asymptotic t-statistics identifying the method of calculation used. More than one method may be specified for alternative VCOV matrices and standard errors. In this case, the first method is stored in @VCOV and @SES, and the VCOV for each method is stored under the name constructed by appending the letter to @VCOV. For example, HCOV=NB would store @VCOV, @VCOVN, and @VCOVB. Consult the table below to see which option is the default in a particular procedure.

The P and Q options are for panel data and are available for ML and PROBIT (REI or FEI). HCOV=P computes grouped BHHH standard errors from the gradient of the objective function using the formula

   VP = [ SUM(i) (SUM(t) Git) (SUM(t) Git)' ]^-1

instead of the usual BHHH formula

   V = [ SUM(i) SUM(t) Git Git' ]^-1

where Git is the gradient vector for individual i and period t. Unlike the usual formula, this version of the estimate does not assume independence within an individual across different time periods. HCOV=Q computes the robust (sandwich) version of this matrix,

   VQ = N [ SUM(i) (SUM(t) Git) (SUM(t) Git)' ] N

where N is the Newton (inverse second derivative) matrix. See Wooldridge (2002), p. 407. For linear models, this matrix is exactly equivalent to that computed by OLSQ, 2SLS, LSQ, and PANEL using the HCOMEGA=BLOCK option.

Note that the rank of VP is at most NI, where NI is the number of individuals, so that when NI is less than the number of coefficients K, the grouped panel estimates of the variance-covariance matrix will be singular and therefore probably inappropriate. However, in most cases NI>>K, and this problem will not arise. Note also that for fixed effects estimation, the gradient is always zero for the fixed effects at the optimum, so the block of V corresponding to these effects is zero. For this reason, standard errors for the fixed effects are always computed using the Newton (second derivative) matrix.
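
A small Python sketch of these grouped formulas (illustrative only, not TSP code; `G` is a hypothetical NI x T x K array holding the gradient vectors Git):

```python
import numpy as np

def bhhh_vcov(G):
    """Usual BHHH estimate: inverse of the sum over all (i, t) of Git Git'."""
    NI, T, K = G.shape
    flat = G.reshape(NI * T, K)
    return np.linalg.inv(flat.T @ flat)

def grouped_bhhh_vcov(G):
    """Grouped estimate (HCOV=P style): gradients are summed within each
    individual first, so correlation across periods within an individual
    is not assumed away."""
    gi = G.sum(axis=1)                    # NI x K: gi = SUM(t) Git
    return np.linalg.inv(gi.T @ gi)

def sandwich_vcov(G, N):
    """HCOV=Q style sandwich: N [SUM(i) gi gi'] N, with N the Newton matrix."""
    gi = G.sum(axis=1)
    return N @ (gi.T @ gi) @ N

rng = np.random.default_rng(0)
G = rng.normal(size=(50, 5, 3))           # 50 individuals, 5 periods, 3 params
VP = grouped_bhhh_vcov(G)
print(VP.shape)                           # (3, 3)
```

The rank remark above is visible here: `gi` has only NI rows, so `gi.T @ gi` cannot have rank above NI.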

HESSCHEC/NOHESSCH compares the analytic Hessian with a discrete Hessian (the differenced analytic gradient).

HITER=B or N or G or F or D specifies the method of Hessian (second derivative matrix) approximation to be used during the parameter iterations. The options are the same as those described above for the estimate of the covariance matrix of the parameter estimates.

Table of HCOV and HITER options

Option  Used to   Name and description                              Default for
        iterate?
------  --------  ------------------------------------------------  --------------------
B       yes       BHHH (Berndt-Hall-Hall-Hausman): covariance of    ML
                  the analytic gradient.

N       yes       Newton: analytic second derivatives.              ARCH iterations,
                                                                    AR1, PROBIT, TOBIT,
                                                                    LOGIT, SAMPSEL

G       yes       Gauss (Gauss-Newton): quadratic form of the       LSQ and FIML
                  analytic gradient and the residual covariance     iterations
                  matrix.

F       yes       BFGS (Broyden-Fletcher-Goldfarb-Shanno):          none
                  analytic or numeric first derivatives, with an
                  update approximation of the Hessian built up
                  from the iterations. Usually HITER=F is
                  superior to HITER=D. HCOV=F is valid only if
                  HITER=F.

D       yes       DFP (Davidon-Fletcher-Powell): analytic or        none
                  numeric first derivatives, with an update
                  approximation of the Hessian built up from the
                  iterations. HCOV=D is valid only if HITER=D.
                  For upward compatibility, it implies a default
                  of GRADIENT=C4.

W       no        Eicker-White: a combination of analytic second    ARCH variance
                  derivatives and BHHH (see the White               estimate
                  references).

R       no        Robust to heteroskedasticity; equivalent to W.    none
                  Used in LSQ only.

P       no        Panel grouped estimate (allows free               none
                  correlation within individuals). PROBIT (FEI
                  or REI) and PANEL only.

Q       no        Panel grouped estimate, robust to                 none (except OLSQ,
                  heteroskedasticity across units (allows free      2SLS, LSQ, and
                  correlation within individuals). PROBIT (FEI      PANEL with the
                  or REI) and PANEL; for OLSQ, 2SLS, and LSQ,       ROBUST option)
                  use HCOMEGA=BLOCK.

C       yes       Discrete Hessian: numeric second derivatives      none
                  based on analytic first derivatives.

U       yes       Numeric second derivatives.                       BJEST and MLPROC
                                                                    variance estimates

NBW     no        Prints all three standard error estimates.        none
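
The F and D options build their Hessian approximation by updating it at each iteration from the observed parameter and gradient changes. As a generic illustration (the textbook BFGS secant update, not necessarily TSP's exact recursion):

```python
import numpy as np

def bfgs_update(H, s, y):
    """Textbook BFGS update of a Hessian approximation H after one
    iteration: s = change in parameters, y = change in gradient
    (see Fletcher 1980). The updated H satisfies H_new @ s = y."""
    Hs = H @ s
    return H - np.outer(Hs, Hs) / (s @ Hs) + np.outer(y, y) / (y @ s)

# On a quadratic objective with Hessian A, the gradient change along a
# step s is exactly A @ s, and the update reproduces it (secant condition).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
H = np.eye(2)                 # start from the identity
s = np.array([1.0, 0.0])      # a parameter step
y = A @ s                     # corresponding gradient change
H_new = bfgs_update(H, s, y)
print(np.allclose(H_new @ s, y))  # True
```

Repeating the update along independent directions pulls the approximation toward the true Hessian without ever computing second derivatives, which is the appeal of HITER=F when analytic second derivatives are expensive.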

MAXIT= maximum number of iterations. The default is 20.

MAXSQZ= maximum number of "squeezes" in the stepsize search. The default depends on STEP:

  STEP option    MAXSQZ default
  -----------    --------------
  BARD                 10
  BARDB                10
  CEA                  10
  CEAB                 10
  GOLDEN               20

Note that some routines (BJEST, LOGIT) reserve MAXSQZ=123 for special options.

NHERMITE=  number of Hermite quadrature points for numeric integration, used for PROBIT (REI).  [default is 20]

PRINT/NOPRINT produces short diagnostic output at each iteration, including a table of the current parameter estimates and their change vector. The value of the objective function for each squeeze on the change vector is also printed. CRIT is the norm of the gradient in the metric of the Hessian, which approaches zero at convergence.

SILENT/NOSILENT suppresses all printed output.

STEP= BARD or BARDB or CEA or CEAB or GOLDEN specifies the stepsize method for squeezing. The default depends on HITER and the procedure:

  HITER option    STEP default
  ------------    ------------
  Newton          CEA
  BHHH            CEA
  Gauss           BARD (for LSQ); CEAB (for FIML)
  DFP             GOLDEN

STEPMEM= n, the number of iterations at which to reset the stepsize. The initial stepsize lambda is remembered from the previous iteration, and reset to lambda=1 every nth iteration. The default is STEPMEM=1, which resets lambda=1 every iteration. This option may be helpful if the natural stepsize for a model is different from 1 and tends to be about the same for successive iterations.

SQZTOL= tolerance for determining the stepsize. Used for STEP=GOLDEN. The default is 0.1.
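
TSP's GOLDEN search internals are not documented here; as a generic sketch under that caveat, a golden-section search that stops once the bracket on the stepsize lambda is narrower than SQZTOL looks like:

```python
import numpy as np

def golden_search(phi, sqztol=0.1, lo=0.0, hi=1.0):
    """Golden-section search for the stepsize lambda maximizing phi(lambda),
    shrinking the bracket by the golden ratio until it is narrower than
    sqztol (the SQZTOL analogue)."""
    invphi = (np.sqrt(5) - 1) / 2          # ~0.618
    a, b = lo, hi
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > sqztol:
        if phi(c) > phi(d):                # maximum lies in [a, d]
            b, d = d, c
            c = b - invphi * (b - a)
        else:                              # maximum lies in [c, b]
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

# phi peaks at lambda = 0.3; the search localizes it to within sqztol/2.
lam = golden_search(lambda t: -(t - 0.3) ** 2)
print(round(lam, 2))
```

Each pass reuses one interior point, so the bracket shrinks by a factor of about 0.618 per function evaluation.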

SYMMETRIC/NOSYMMETRIC is an old option which has been replaced with GRADIENT=method. SYMMETRIC is the same as GRAD=C4; NOSYM is equivalent to GRAD=FORWARD.

TERSE/NOTERSE produces brief output consisting of the objective function for the estimation procedure, and a table of coefficient estimates and standard errors.

TOL= tolerance for determining convergence of the parameters, using a unit stepsize. The default for most procedures is .001; for AR1 it is .000001.

TOLG= tolerance for determining convergence of the norm of the gradient (printed as CRIT in the output). The default is .001. CRIT = g'H^-1 g, which at convergence is usually many orders of magnitude smaller than .001.

TOLS= tolerance for determining convergence of the parameters, using the squeezed step. The default is 0; that is, ignore the squeezed change in the parameters and use the regular TOL instead.
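
The TOL and TOLG tests can be combined in a small sketch (how TSP combines the two tests is not spelled out above, so requiring both to pass is an assumption of this sketch; H is the Hessian approximation, taken positive definite here):

```python
import numpy as np

def converged(delta, g, H, tol=1e-3, tolg=1e-3):
    """Sketch of the convergence tests named above: all parameter changes
    below TOL, and CRIT = g'H^-1 g below TOLG. (Requiring both is an
    assumption; H is assumed positive definite.)"""
    crit = g @ np.linalg.solve(H, g)       # CRIT = g'H^-1 g
    return bool(np.max(np.abs(delta)) < tol and crit < tolg)

H = np.eye(2)
print(converged(np.array([1e-5, 2e-5]), np.array([1e-4, 0.0]), H))  # True
print(converged(np.array([0.5, 0.1]), np.array([1.0, 1.0]), H))     # False
```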

VERBOSE/NOVERBOSE produces lots of diagnostic output, including the gradient, Hessian, and inverse Hessian at each iteration, and the non-inverted Hessian for each output VCOV.

Starting values:

The default values depend on the procedure. For the standard TSP models which are (potentially) nonlinear in the parameters (LSQ, FIML, ML), the user provides them with PARAM and SET. PROBIT and LOGIT use zeros. TOBIT uses a regression and formulas from Greene (1981). SAMPSEL uses probit and a regression.

The default is overridden in the linear model procedures (ARCH, BJEST, PROBIT, TOBIT, LOGIT, and SAMPSEL) if the user supplies a matrix named @START. The length of @START must be equal to the number of parameters in the estimation (otherwise it is ignored). The easiest way to create @START is with a statement like:

MMAKE @START 12.3 4.56 33 44 55;

The order of the parameters in @START is obvious for most of the linear models, except for the following:

TOBIT: SIGMA comes last.

SAMPSEL: the probit equation is first, then the regression equation, and then SIGMA and RHO.

References

Berndt, E. K., B. H. Hall, R. E. Hall, and J. A. Hausman, "Estimation and Inference in Nonlinear Structural Models," Annals of Economic and Social Measurement (October 1974), pp. 653-665.

Calzolari, Giorgio, and Gabriele Fiorentini, “Alternative Covariance Estimators of the Standard Tobit Model,” Paper presented at the World Congress of the Econometric Society, Barcelona, August 1990.

Calzolari, Giorgio, and Lorenzo Panattoni, "Alternative Estimators of FIML Covariance Matrix: A Monte Carlo Study," Econometrica 56 (1988), pp. 701-714.

Fletcher, R., Practical Methods of Optimization, Volume I: Unconstrained Optimization, John Wiley and Sons, New York, 1980.

Gill, Philip E., Walter Murray, and Margaret H. Wright, Practical Optimization, Academic Press, New York, 1981.

Goldfeld, S. M., and R. E. Quandt, Nonlinear Methods in Econometrics, North-Holland, 1972.

Greene, William H., "On the Asymptotic Bias of the Ordinary Least Squares Estimator of the Tobit Model," Econometrica 49 (March 1981), pp. 505-513.

Quandt, Richard E., "Computational Problems and Methods," in Griliches and Intriligator, eds., Handbook of Econometrics, Volume I, North-Holland Publishing Company, Amsterdam, 1983.

White, Halbert, "Maximum Likelihood Estimation of Misspecified Models," Econometrica 50 (1982), pp. 1-25.

White, Halbert, "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity," Econometrica 48 (1980), pp. 721-746.

Wooldridge, J. M., Econometric Analysis of Cross Section and Panel Data, Cambridge, MA: MIT Press, 2002.