Econometric Benchmarks
(Last updated: 9/29/2004)
Here are some standard benchmark datasets and models for testing
the accuracy of TSP or other econometrics packages.
Most of the models are just classic published results; others
are designed to be numerically difficult (designated as [difficult] here).
Jump down to section:
- Basic Statistics, OLS, NLS
- Simultaneous equations: SUR, 2SLS, LIML, 3SLS, FIML
- AR(1) regression
- unit root, cointegration testing
- Durbin-Watson test - tables of critical values
- ARIMA (Exact ML)
- GARCH
- Logit, Probit
- Count data: Poisson, NegBin1, NegBin2, Ordered Probit, panel data
- Panel data: Static models
- Panel data: Dynamic models
Basic Statistics, Linear and Nonlinear Regression
- linear regression:
Longley, JASA (1967) [difficult]
longley.tsp
text data, benchmark, TSP code
longley.wks
longley.xls
spreadsheet version of data
Note: to download the .WKS and .XLS files in some browsers,
right-click the link and save the file.
longley.htm Some brief research
(2/1997) on obtaining a Longley coefficient vector accurate to
11 digits.
- missing data, means, correlation, linear regression:
Wilkinson's "Statistics Quiz" [difficult]
nasty.tsp text data, code
wilk.txt
Full original text of the "Statistics Quiz",
including the correct answers.
nasty.wks
nasty.xls spreadsheet data
For comparisons of how
several packages have fared on this benchmark, see:
- Sawitzki, G. "Testing numerical reliability of data
analysis systems,"
Computational Statistics and Data Analysis 18,
1994, pp.269-286.
- National Institute of Standards and Technology (NIST)
"Statistical Reference Datasets"
NIST StRD web page
Contains many test problems and certified results for univariate
statistics, anova, linear regression, and nonlinear regression.
- Univariate statistics [difficult]
- ANOVA [difficult]
- ano.tsp text data, benchmarks,
code to run 11 models
- Table of results for Sun 4
TSP obtains at least 7 correct digits for the
F-statistic in 8 of the 11 models.
- nistanow.zip
siresist.wks
agatomic.wks
simonles.wks
nistanox.zip
siresist.xls
agatomic.xls
simonles.xls
spreadsheet data (including Simon-Lesage 1,4,7).
Simon-Lesage 3,6,9 are too large (18,009 obs) for
.xls files, and 2,5,8 are large but similar to 1,4,7.
- Ordinary Least Squares [difficult]
- Nonlinear Least Squares [difficult]
For comparisons of how
several packages have fared on these benchmarks, see:
- McCullough, Bruce D., "Assessing the Reliability of
Statistical Software: Part I,"
The American Statistician 52, November 1998, pp.355-363.
- McCullough, Bruce D., "Assessing the Reliability of
Statistical Software: Part II,"
The American Statistician 53, May 1999, pp.149-159.
- McCullough, Bruce D., "Econometric Software Reliability:
EViews, LIMDEP, SHAZAM, and TSP,"
Journal of Applied Econometrics 14, 1999, pp.191-202.
- Lilien, David M., "Econometric Software Reliability and
Nonlinear Estimation in EViews: Comment,"
Journal of Applied Econometrics 15, 2000, pp.107-110.
- McCullough, Bruce D., "Reply,"
Journal of Applied Econometrics 15, 2000, pp.111+
- McCullough, Bruce D. and Wilson, Berry,
"On the accuracy of statistical procedures in Microsoft Excel 97,"
Computational Statistics and Data Analysis 31, 1999, pp.27-37.
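The "correct digits" counts quoted for the NIST results above can be reproduced with the log relative error (LRE), the measure used in the McCullough papers. The following is a minimal sketch, not code from TSP; the function name and the sample numbers are hypothetical.

```python
import math

def log_relative_error(estimate, certified):
    """Approximate number of correct significant digits in `estimate`."""
    if estimate == certified:
        # Exact agreement; callers may want to cap this at the number
        # of digits actually certified.
        return math.inf
    if certified == 0.0:
        # Relative error is undefined, so fall back to log absolute error.
        return -math.log10(abs(estimate))
    return -math.log10(abs(estimate - certified) / abs(certified))

# Hypothetical numbers for illustration only (not from any benchmark file).
certified_f = 1.0
computed_f = 1.0 + 1e-8
print(round(log_relative_error(computed_f, certified_f), 1))
```

A result of 7 or more corresponds to the "at least 7 correct digits" criterion used for the ANOVA table.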
Simultaneous Equations
- SUR (Seemingly Unrelated Regressions), iterated SUR,
(iterated) SUR with cross-equation constraints:
Grunfeld investment data, 2 firms, following Theil (1971)
grunsur2.tsp text data, benchmarks, code
grunsur2.wks
grunsur2.xls spreadsheet data
- SUR (Seemingly Unrelated Regressions), iterated SUR,
(iterated) SUR with cross-equation constraints, singular equation
system:
Berndt and Christensen, US manufacturing labor/capital data,
3 equations, following Berndt and Savin (1975)
bs75d.tsp text data, benchmarks, code
- 2SLS, LIML, 3SLS, FIML:
Klein I model
Combined in a single file, with references to original
articles with correct and incorrect results
klein.tsp text data, benchmarks, code
klein.wks
klein.xls spreadsheet data
Individual files
- 2SLS and 3SLS: Klein I
klein3s.tsp text data, code,
references
- LIML: Klein I
kleinlml.tsp text data, code,
references
- FIML: Klein I
kleinfml.tsp text data, benchmark,
code, references
kleinfm2.tsp high precision
benchmark, including 3 different types of standard errors
- 2SLS, LIML, 3SLS, iterated 3SLS, FIML:
Kmenta's simple supply/demand model
kmenta.tsp text data, benchmarks, code
kmenta.wks
kmenta.xls spreadsheet data
- nonlinear FIML:
Bodkin and Klein model
bodkin.tsp text data, benchmarks, code
Time Series
- AR(1) regression models (conditional and exact ML)
Changes to TSP's AR1 command 8/1998
Note: the AR1 code used in the examples below does not take
advantage of the 8/1998 changes. It uses the older, separate grid search
and iteration methods, which were combined in 8/1998.
- AR(1) conditional ML via grid search (Hildreth-Lu),
AR(1) conditional ML via full iteration (Cochrane-Orcutt),
AR(1) exact ML via full iteration (Beach-MacKinnon),
Durbin-Watson statistic and its exact P-value:
Bartlett pears data -- analyzed by Hildreth and Lu,
and by Henshaw
pears.tsp text data, benchmark, code
pears.wks
pears.xls spreadsheet data
- AR(1) conditional and exact ML:
Longley data -- following Lovell and Selover [difficult]
longar1.tsp text data, benchmark, code
longley.wks
longley.xls spreadsheet data
- AR(1) and AR(2) conditional and exact ML:
Klein I consumption function --
following Beach and MacKinnon (1978)
kleinar2.tsp text data, benchmark, code
- AR(1) with lagged dependent variable -- multiple optima
and consistent standard errors,
testing for autocorrelation in OLS with a lagged
dependent variable -- Durbin-Watson, Durbin's h, Durbin's m:
electric utility demand, NERC dataset, Berndt (1990)
ar1lag.tsp text data, benchmark, code
ar1lag.wks
ar1lag.xls spreadsheet data
- AR(1) conditional and exact ML, with multiple optima:
Dufour, Gaudry and Liem, also Lovell and Selover [difficult]
dufour.tsp text data, benchmark, code
dufour.wks
dufour.xls spreadsheet data
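For readers checking another package against the AR(1) benchmarks above, the Durbin-Watson statistic and the Cochrane-Orcutt iteration can be sketched as follows. This is a minimal illustration for a single regressor on synthetic data, not the TSP AR1 implementation; all function names are hypothetical, and the exact Durbin-Watson P-value is not computed.

```python
import random

def ols_line(x, y):
    """OLS intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def durbin_watson(resid):
    """Sum of squared first differences of residuals over their sum of squares."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)

def cochrane_orcutt(x, y, tol=1e-10, max_iter=200):
    """Iterate between estimating rho from residuals and OLS on quasi-differences."""
    a, b = ols_line(x, y)
    rho = 0.0
    for _ in range(max_iter):
        e = [yi - a - b * xi for xi, yi in zip(x, y)]
        new_rho = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
                   / sum(e[t - 1] ** 2 for t in range(1, len(e))))
        # Quasi-difference the data; the first observation is dropped
        # (conditional ML).
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols_line(xs, ys)
        a = a_star / (1.0 - new_rho)  # recover the original intercept
        if abs(new_rho - rho) < tol:
            return a, b, new_rho
        rho = new_rho
    return a, b, rho

# Synthetic data, illustration only: y = 1 + 2x + u, u[t] = 0.6*u[t-1] + noise.
rng = random.Random(0)
n = 500
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
u = [0.0] * n
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.gauss(0.0, 1.0)
y = [1.0 + 2.0 * x[t] + u[t] for t in range(n)]

a0, b0 = ols_line(x, y)
dw = durbin_watson([yi - a0 - b0 * xi for xi, yi in zip(x, y)])
a, b, rho = cochrane_orcutt(x, y)
print(dw, a, b, rho)
```

The iteration stops at whichever optimum it reaches from its starting point, which is why the benchmarks marked "multiple optima" are worth running with a grid search over rho as well.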
- unit root testing
- Durbin-Watson test
- univariate ARIMA models (exact ML)
pure AR (see also AR(1) above)
- AR(1) and AR(2) conditional and exact ML:
Klein I consumption function --
following Beach and MacKinnon (1978)
kleinar2.tsp text data, benchmark, code
- AR(2) and AR(3) with constant:
Box-Jenkins series E (sunspots)
bje.tsp text data, benchmark, code
bje.wks
bje.xls spreadsheet data
pure MA
- MA(1) -- actually ARIMA(0,1,1),
Box-Jenkins series A
bja.tsp text data, benchmark, code
(includes general comments on ARIMA benchmarks)
bja.wks
bja.xls spreadsheet data
ma1.tsp handcoded MA(1) LogL
The handcoded version uses:
- MA(2) -- actually ARIMA(0,2,2):
Box-Jenkins series C, 2 subsets, following Osborn (1976)
bjc.tsp text data, benchmark, code
bjc.wks
bjc.xls spreadsheet data
- ARIMA(0,0, 1,1, 1,1) -- multiplicative seasonal MA:
Box-Jenkins series G (monthly airline passengers)
bjg.tsp text data, benchmark, code
bjg.wks
bjg.xls spreadsheet data
- ARIMA(0,0, 1,1, 1,1) -- multiplicative seasonal MA:
Pankratz (1991) series 12 (log KWH), 23 (housing starts),
24 (housing sales) -- follows Newbold, Agiakloglou, and Miller
(1994)
bjnam.tsp text data, benchmark, code
bjnama.wks
bjnambc.wks
bjnama.xls
bjnambc.xls spreadsheet data
mixed ARMA
- ARMA(1,1) with constant (exact ML):
Box-Jenkins series A
bja.tsp text data, benchmark, code
(same as above under MA(1))
bja.wks
bja.xls spreadsheet data
bjacls.tsp (conditional ML -
2 benchmarks)
- ARIMA(2,1,2):
illustrates multiple local optima, for different starting
values.
original data from Campbell and Mankiw - log(Real GNP quarterly)
as given by Perron
follows Newbold, Agiakloglou, and Miller (1994),
who actually used an extended version of these data.
The three local optima NAM found also seem to exist (at
roughly the same parameter values) for the original data.
bjrg.tsp text data, code
other
- Partial AutoCorrelation function -- compares
Yule-Walker, OLS, Burg, and Exact ML methods:
Box-Jenkins series F (chemical yields)
bjfpac.tsp text data, benchmarks, code
bjf.wks
bjf.xls spreadsheet data
- GARCH models
- GARCH(1,1) with constant:
Bollerslev and Ghysels (1996) daily Deutschmark-British Pound
exchange rate
uses TSP 4.4 features as of 5/11/98:
different init options
for h(t) and e(t)**2, iteration with analytic second
derivatives, and QMLE standard errors
bg44.tsp code, 6-digit benchmarks
(for 3 different presample initialization options)
dmbp.dat text data
garch11w.zip
garch11x.zip (zipped) spreadsheet data
- bgfcp.tsp Same model as above, but gives
a full 11-digit solution (for one of the initialization options).
Verified with software from Fiorentini, Calzolari, and Panattoni,
as well as the independent TSP code in bg11 below.
- bg11i.tsp
Same model as above, but reproduces SEs from Information Matrix
and Bollerslev-Wooldridge, as given by FCP software.
Uses g11s.tsp below to compute recursive first derivatives.
- bg11.tsp
GARCH(1,1) with constant
Bollerslev and Ghysels (1996) daily Deutschmark-British Pound
exchange rate
same as bg44.tsp, but can run on earlier versions of TSP
(uses a lot of complicated code to evaluate the analytic
second derivatives by hand).
- Asymmetric EGARCH(1,1) with constant:
Bollerslev and Ghysels (1996) daily Deutschmark-British Pound
exchange rate
uses numeric second derivatives feature released in
TSP 4.5 on 3/30/00
egarch.tsp code, 6-digit benchmarks
(for 2 different presample initialization options)
- IGARCH(1,1) with constant:
Bollerslev and Ghysels (1996) daily Deutschmark-British Pound
exchange rate
uses ARCH command to compute unrestricted derivatives, and
then does the restricted (alpha0=0, alpha1=1-beta1) iterations.
bgi.tsp code, 5-digit benchmarks
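The GARCH(1,1) benchmarks above report separate results for different presample initializations of h(t) and e(t)**2. The sketch below shows where that choice enters a Gaussian GARCH(1,1) log likelihood. It is an illustration under stated assumptions: the two option names are hypothetical and are not necessarily the options TSP offers.

```python
import math

def garch11_loglik(y, mu, alpha0, alpha1, beta1, presample="sample_variance"):
    """Gaussian log likelihood for y[t] = mu + e[t],
    h[t] = alpha0 + alpha1*e[t-1]**2 + beta1*h[t-1]."""
    e = [yt - mu for yt in y]
    if presample == "sample_variance":
        # Hypothetical option: set presample h and e**2 to the mean of e**2.
        h0 = sum(et * et for et in e) / len(e)
    elif presample == "unconditional":
        # Hypothetical option: the unconditional variance implied by the
        # parameters.
        if alpha1 + beta1 >= 1.0:
            raise ValueError("unconditional variance needs alpha1 + beta1 < 1")
        h0 = alpha0 / (1.0 - alpha1 - beta1)
    else:
        raise ValueError("unknown presample option: %r" % presample)
    e2_prev, h_prev = h0, h0
    loglik = 0.0
    for et in e:
        h = alpha0 + alpha1 * e2_prev + beta1 * h_prev
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(h) + et * et / h)
        e2_prev, h_prev = et * et, h
    return loglik
```

The IGARCH restriction listed above (alpha0=0, alpha1=1-beta1) has no finite unconditional variance, so only a data-based presample value would be usable in this sketch.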
Qualitative Dependent Variables
- binary Logit, Probit, Maximum Score Estimation:
Spector and Mazzeo economic education data (32 obs. x 4 vars.),
following Greene (1993, 1997)
greenelp.tsp text data, benchmarks, code
greenelp.wks
greenelp.xls spreadsheet data
- Trinomial Probit:
Daganzo transportation choice data (50 obs. x 4 vars.),
following Bunch (1991)
daganzo.tsp benchmarks, code
for 2 different normalizations of Sigma matrix; comparison
with Bunch and Limdep 7.0 beta (GHK simulator) results
daganzo.txt text data
daganzo.wks spreadsheet data
daganzo.lim Limdep/Nlogit code
- Poisson, Negative Binomial 1 and 2, Ordered Probit:
Doctor Visits model (5190 obs. x 13 vars.),
following
Cameron and Trivedi (1986, 1998)
count.tsp benchmarks, code
counta.zip (zipped) text data
countw.zip
countx.zip (zipped) spreadsheet data
- Poisson on panel data - fixed and random effects,
Patents and R&D model (346 obs. x 18 vars.),
following Cameron and Trivedi (1998)
poispan.tsp benchmarks, code
countpa.zip (zipped) text data
countpw.zip
countpx.zip (zipped) spreadsheet data
- Negative Binomial 1 on panel data - fixed and random effects,
Patents and R&D model (346 obs. x 18 vars.),
following Cameron and Trivedi (1998)
nbpan.tsp benchmarks, code
countpa.zip (zipped) text data
(same as above)
countpw.zip
countpx.zip (zipped) spreadsheet data
- Frontier production function
pure cross section model,
same as Example 9.6.4 in TSP User's Guide
EG1 model from Frontier program (60 obs. x 3 vars.),
following Coelli
fronteg1.tsp benchmarks, code
fronteg1.txt text data
Panel Data Models - Static
- Pooled OLS, Fixed Individual Effects, Random Individual Effects (ML),
Fixed Time Effects, Fixed Individual & Time Effects,
Random Time Effects (ML), Random Individual & Time Effects (ML)
Grunfeld investment data, balanced, 10 firms, 20 years
following Nerlove (2000), Baltagi (1995)
Different versions of Grunfeld data
grfere.tsp benchmarks, code
grunfeld.dat text data
grunfeld.wks
grunfeld.xls spreadsheet data
- Pooled OLS, Fixed Individual Effects, Random Individual Effects (ML),
Random Individual & Time Effects (ML), Fixed Individual and Time
Effects, nested Two-way ML
Datasets and benchmarks from Baltagi, Badi H., "Econometric
Analysis of Panel Data", second edition, 2001.
official book web site
TSP results generally match the book to 3 digits printed;
sometimes there are small differences in standard errors
of the nonlinear models.
TSP results use an updated (10/2003) PANEL(REI,REIT) command
- Coefficients varying by i
Proc Panbi - allows any set of coefficients to vary by i
Efficient computation by sweeping out effects in a loop,
so hundreds or thousands of different coefficients can be
estimated without inverting any large matrices.
Grunfeld investment data (n=10)
grbi.tsp benchmarks, code
- Random Individual Effects plus AR(1) (exact ML)
Grunfeld investment data
follows Baltagi and Li (1991)
grar1rei.tsp code
grar1rei.out TSP results
Although we believe the above results are correct for the
Grunfeld data, it would be preferable to replicate some
published results, such as those cited in Baltagi (2001), p.84.
- Diagonal heteroskedasticity, AR(1) (exact ML), diagonal het
with AR(1)
Grunfeld investment data
grhetar.tsp benchmarks, code
- SUR by firm (since N < T), SUR with AR(1) (exact ML)
Grunfeld investment data
grsurar.tsp benchmarks, code
- Robust (HAC) SEs for OLS coefficients, using diagonal and
block-diagonal patterns for Omega. Most code works for
unbalanced data. Includes Beck-Katz SEs, which are implemented for
balanced data only and do not handle conditional heteroskedasticity.
Grunfeld investment data
ghac.tsp code
- Score/LM tests for AR(1) and Random Individual Effects,
robust to local misspecification.
Greene version of Grunfeld investment data (5 firms)
Different versions of Grunfeld data
following Bera et al. (2001)
gscore.tsp benchmark, code
grun5.dat text data
gscore10.tsp same tests,
but computed on the complete/original Grunfeld data (10 firms)
glr.tsp equivalent Likelihood Ratio and
Wald tests, on all 10 firms
glr.out full output file for glr.tsp
- LM tests for Random Individual and Time Effects.
Breusch and Pagan (1980).
Grunfeld investment data (full dataset)
following Baltagi (2001)
grbpre.tsp benchmark, code
Panel Data Models - Dynamic
- OLS, Fixed Effects, Random Effects (FGLS and Conditional ML),
Between, Anderson-Hsiao 2SLS (AH-L, AH-D)
Penn World Table, growth model, balanced data
following Nerlove (1999); see full citations in benchmark file
penngrow.tsp benchmarks, code
penngrow.txt text data
penngrow.wks
penngrow.xls spreadsheet data
- Fixed Effects Bias Correction (also does AH-L)
Penn World Table, growth model, balanced data
following Kiviet (1995), Judson and Owen (1996)
lsdvc.tsp benchmarks, code
- OLS, Fixed Effects, Anderson-Hsiao 2SLS (AH-L, AH-D)
following Arellano and Bond (1991) - unbalanced data
arelbond.tsp benchmarks, code
(permission to distribute data pending)
arelbond.txt text data
arelbond.wks
arelbond.xls spreadsheet data
- GMM first-difference model, written by Yoshitsugu Kitazawa
following Arellano and Bond (1991) - unbalanced data
DPD page benchmarks, code
Random Number Generation
- Checks the new uniform generator in TSP 4.5,
by summing the first 10,000,000 variates.
The answer matches that given by L'Ecuyer (1999).
rantst.tsp
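The same kind of check can be sketched outside TSP. The code below assumes, for illustration only, that the generator is the MRG32k3a combined multiple recursive generator from L'Ecuyer (1999) with all six seeds set to 12345; this page does not say which generator or seed TSP 4.5 uses, and the function names are hypothetical.

```python
# MRG32k3a constants (L'Ecuyer 1999).
M1 = 4294967087
M2 = 4294944443
A12, A13N = 1403580, 810728
A21, A23N = 527612, 1370589
NORM = 1.0 / (M1 + 1)

def mrg32k3a(seed=12345):
    """Yield uniform variates in (0, 1); all six state words start at `seed`."""
    s10 = s11 = s12 = seed
    s20 = s21 = s22 = seed
    while True:
        # First component: x[n] = (A12*x[n-2] - A13N*x[n-3]) mod M1
        p1 = (A12 * s11 - A13N * s10) % M1
        s10, s11, s12 = s11, s12, p1
        # Second component: x[n] = (A21*x[n-1] - A23N*x[n-3]) mod M2
        p2 = (A21 * s22 - A23N * s20) % M2
        s20, s21, s22 = s21, s22, p2
        d = p1 - p2
        if d <= 0:
            d += M1
        yield d * NORM

def sum_first_variates(n, seed=12345):
    """Sum the first n variates, accumulating in double precision."""
    gen = mrg32k3a(seed)
    total = 0.0
    for _ in range(n):
        total += next(gen)
    return total
```

Calling sum_first_variates(10_000_000) gives a single number to compare against the published value; in pure Python this takes several seconds.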
If you have comments on these benchmarks, please send
email to Clint Cummins:
clint@leland.stanford.edu.