# Econometric analysis of longitudinal data

*James Heckman* and *Burton Singer*

Chapter 29 in *Handbook of Econometrics*, 1986, vol. 3, pp 1689-1763 from Elsevier

**Abstract:**
This paper considers the formulation and estimation of continuous time social science duration models. The focus is on new issues that arise in applying statistical models developed in biostatistics to analyze economic data and formulate economic models. Both single spell and multiple spell models are discussed. In addition, we present a general time inhomogeneous multiple spell model which contains a variety of useful models as special cases.

Four distinctive features of social science duration analysis are emphasized:

(1) Because of the limited size of samples available in economics and because of an abundance of candidate observed explanatory variables and plausible omitted explanatory variables, standard nonparametric procedures used in biostatistics are of limited value in econometric duration analysis. It is necessary to control for observed and unobserved explanatory variables to avoid biasing inference about underlying duration distributions. Controlling for such variables raises many new problems not discussed in the available literature.

(2) The environments in which economic agents operate are not the time homogeneous laboratory environments assumed in biostatistics and reliability theory. Ad hoc methods for controlling for time inhomogeneity produce badly biased estimates.

(3) Because the data available to economists are not obtained from the controlled experimental settings available to biologists, doing econometric duration analysis requires accounting for the effect of sampling plans on the distributions of sampled spells.

(4) Econometric duration models that incorporate the restrictions produced by economic theory only rarely can be represented by the models used by biostatisticians. The estimation of structural econometric duration models raises new statistical and computational issues.

Because of (1) it is necessary to parameterize econometric duration models to control for both observed and unobserved explanatory variables. Economic theory only provides qualitative guidance on the matter of selecting a functional form for a conditional hazard, and it offers no guidance at all on the matter of choosing a distribution of unobservables. This is unfortunate because empirical estimates obtained from econometric duration models are very sensitive to assumptions made about the functional forms of these model ingredients.

In response to this sensitivity we present criteria for inferring qualitative properties of conditional hazards and distributions of unobservables from raw duration data sampled in time homogeneous environments; i.e., from unconditional duration distributions. No parametric structure need be assumed to implement these procedures.

We also note that current econometric practice overparameterizes duration models. Given a functional form for a conditional hazard determined up to a finite number of parameters, it is possible to consistently estimate the distribution of unobservables nonparametrically. We report on the performance of such an estimator and show that it helps to solve the sensitivity problem.

We demonstrate that in principle it is possible to identify both the conditional hazard and the distribution of unobservables without assuming parametric functional forms for either. Tradeoffs in assumptions required to secure such model identification are discussed. Although under certain conditions a fully nonparametric model can be identified, the development of a consistent fully nonparametric estimator remains to be done.

We also discuss conditions under which access to multiple spell data aids in solving the sensitivity problem. A superficially attractive conditional likelihood approach produces inconsistent estimators, but the practical significance of this inconsistency is not yet known. Conditional inference schemes for eliminating unobservables from multiple spell duration models that are based on sufficient or ancillary statistics require unacceptably strong assumptions about the functional forms of conditional hazards and so are not robust. Contrary to recent claims, they offer no general solution to the model sensitivity problem.

The problem of controlling for time inhomogeneous environments (Point (2)) remains to be solved. Failure to control for time inhomogeneity produces serious biases in estimated duration models. Controlling for time inhomogeneity creates a potential identification problem.

For single spell data it is impossible to separate the effect of duration dependence from the effect of time inhomogeneity by a fully nonparametric procedure. Although it is intuitively obvious that access to multiple spell data aids in the solution of this identification problem, the development of precise conditions under which this is possible is a topic left for future research.

We demonstrate how sampling schemes distort the functional forms of sample duration distributions away from the population duration distributions that are the usual object of econometric interest (Point (3)). Inference based on misspecified duration distributions is in general biased. New formulae for the densities of commonly used duration measures are produced for duration models with unobservables in time inhomogeneous environments. We show how access to spells that begin after the origin date of a sample aids in solving econometric problems created by the sampling schemes that are used to generate economic duration data.

We also discuss new issues that arise in estimating duration models explicitly derived from economic theory (Point (4)). For a prototypical search unemployment model we discuss and resolve new identification problems that arise in attempting to recover structural economic parameters. We also consider nonstandard statistical problems that arise in estimating structural models that are not treated in the literature. Imposing or testing the restrictions implied by economic theory requires duration models that do not appear in the received literature and often requires numerical solution of implicit equations derived from optimizing theory.
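
As an illustration of why Point (1) matters, the sketch below computes the population hazard for a hypothetical two-type population in which every individual has a constant hazard. The rates and weights are illustrative assumptions, not values from the chapter. Because high-hazard individuals exit first, the hazard of the unconditional duration distribution declines with duration even though no individual exhibits duration dependence.

```python
import math

def mixture_hazard(t, rates=(0.5, 2.0), weights=(0.5, 0.5)):
    """Population hazard at duration t for a mixture of exponentials.

    Each individual has a constant hazard drawn from `rates` with
    probabilities `weights` (both are illustrative assumptions).
    """
    # Unconditional survivor function: weighted average of type survivors.
    survivor = sum(w * math.exp(-r * t) for r, w in zip(rates, weights))
    # Unconditional density: weighted average of type densities.
    density = sum(w * r * math.exp(-r * t) for r, w in zip(rates, weights))
    return density / survivor

durations = [0.0, 0.5, 1.0, 2.0, 4.0]
hazards = [mixture_hazard(t) for t in durations]

for t, h in zip(durations, hazards):
    print(f"t = {t:3.1f}  population hazard = {h:.3f}")
```

At duration zero the population hazard equals the weighted mean of the two rates; it then falls toward the lower rate as the high-hazard type leaves the surviving population. An analyst who ignored the unobserved type would read this pattern as negative duration dependence.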

**JEL-codes:** C39

**Date:** 1986

**Downloads:** (external link)

http://www.sciencedirect.com/science/article/B7GX7 ... 00a2ae5a6636c338e7cb

Full text for ScienceDirect subscribers only

**Persistent link:** https://EconPapers.repec.org/RePEc:eee:ecochp:3-29
