HIST produces histograms (bar charts or frequency distributions) of series. It is convenient for obtaining a rough picture of the univariate distribution of your data. The graphics version (options NOPRINT, PREVIEW) is the default for TSP/OxMetrics.
HIST (CDF, DENSITY, DISCRETE, HIST, MAX=<maximum X value>, MIN=<minimum X value>, NBINS=<number of bins>, NORMAL, PRINT, PREVIEW, STANDARD, TITLE=<text string>) <list of series> ;
Usage
Follow HIST with the names of one or more series for which you would like to see a frequency distribution. The default options for output yield a histogram with ten equally spaced bins or cells running from the minimum value of the series to the maximum value.
Plots of the histogram for each series are produced, each in a separate window.
The following is stored in data storage:
| variable | type | length | description |
|---|---|---|---|
| @HIST | matrix | #nbins*#series | matrix with observation counts for each series |
| @HISTVAL | matrix | #nbins*#series | matrix with bin lower bounds for each histogram |
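The stored matrices can be inspected once the command has run. A minimal sketch, assuming a series named X already exists in the current sample (X and the choice of five bins are illustrative only); PRINT is the ordinary TSP print command:

```
? Compute a five-bin histogram of X without printing it.
HIST (NOPRINT, NBINS=5) X ;
? Show the stored cell counts and the lower bound of each cell.
PRINT @HIST @HISTVAL ;
```

With a single series, each matrix should have five rows and one column, following the #nbins*#series layout given above.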
Options
CDF/NOCDF includes a normal QQ plot below the histogram.
DENSITY/NODENSITY superimposes a smooth density on the histogram.
DISCRETE/NODISCRETE specifies whether the series are discrete or continuous. If the series are discrete, there will be one cell for each unique value (limited by NBINS). Cells with zero counts will be omitted.
HIST/NOHIST specifies whether to include a bar-type histogram in the printout.
MAX= upper bound on the last cell. The default is the maximum value of the series.
MIN= lower bound on the first cell. The default is the minimum value of the series.
NBINS= the number of bins or cells (the default is 10 for NODISCRETE and 20 for DISCRETE).
NORMAL/NONORMAL superimposes a normal density on the histogram.
PRINT/NOPRINT specifies whether the histogram is to be printed or just stored.
STANDARD/NOSTANDARD standardizes the data before plotting.
TITLE= 'title string' labels the plot.
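Several options can be given in one command. As a sketch, assuming X is an existing series, the following combines only options from the list above:

```
? Standardize X, use 20 cells, and superimpose a normal density for comparison.
HIST (STANDARD, NORMAL, NBINS=20, TITLE='Standardized X') X ;
```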
Examples
HIST X ;
produces a plot with the vertical axis containing ten cells running from the minimum value of X to the maximum value of X, and the horizontal axis showing the number of observations of X which take on values within each of the cells.
HIST (MAX=100,MIN=0,NBINS=40) Y1 Y2 ;
produces two histograms, each with 40 cells of width 2.5 (the range from 0 to 100 divided into 40 cells). The fraction of observations of Y1 (or Y2) that falls in each cell is shown.
Suppose the variable REASON takes on the values 0, 1, 2, and 3, with the following counts:
| Value | Number of observations |
|---|---|
| 0 | 10 |
| 1 | 0 |
| 2 | 34 |
| 3 | 42 |
The command
HIST (DISCRETE) REASON ;
will produce a histogram with three cells, containing the number of observations for REASON = 0, 2, and 3. The cell for REASON = 1 is omitted because its count is zero.
On the other hand, the command
HIST REASON ;
will default to the INTEGER mode and produce a histogram with four cells, containing the number of observations taking on the four values of REASON.
If SIZE is a variable containing the log sales or employment of a cross section of firms,
HIST (DENSITY,TITLE="Size distribution") SIZE ;
produces a graph of the size distribution for the firms with a smooth approximation to the density superimposed.
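A fuller version of this example might construct SIZE first. A sketch, assuming a hypothetical series SALES of strictly positive firm sales, and assuming that the DENSITY and NORMAL overlays can be requested together:

```
? SALES is a hypothetical series; LOG requires positive values.
GENR SIZE = LOG(SALES) ;
? Histogram of log sales with a smooth density and, assuming the two options combine, a normal density.
HIST (DENSITY, NORMAL, TITLE="Size distribution") SIZE ;
```

If log sales are roughly normally distributed, the two superimposed curves should lie close together.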