National Longitudinal Surveys

NLSY79 User Guides and Documentation

Using and Understanding the Data

Information on survey instruments, variable types, the interviewing process, item nonresponse, sample weights and design effects, data documentation, and how to access the data is available below. To learn more about the cohort, please see the NLSY79 Data Overview.


Survey Instruments

A unique set of survey instruments has been used during each survey year to collect information from respondents. The term "survey instrument" is used to refer to:

  • the questionnaires that serve as the primary source of information on a given respondent
  • questionnaire supplements fielded during select survey years that contain additional sets of questions
  • documents such as the household interview forms or household record cards that collect information on members of each respondent's household

View Survey Instruments.

Types of Variables

There are six types of variables present in the NLSY79 data. Some are the raw answers provided by the respondent, while others are constructed. Types of variables include:

  • Direct (or raw) responses from a questionnaire or other survey instrument
  • Edited variables constructed from raw data according to consistent and detailed sets of procedures, such as occupational codes, KEY variables, and so forth
  • Constructed variables based on responses to more than one data item, either cross-sectionally or longitudinally, and edited for consistency where necessary, such as variables on the NLSY79 Supplemental Fertility File ("Fertility and Relationship History/Created" area of interest in NLS Investigator)
  • Constructed variables from other sources, such as the County & City Data Book information present on the NLSY79 Geocode data files
  • Variables provided by an outside organization based on sources not directly available to the user, such as the high school survey and transcript data, scores from the Armed Services Vocational Aptitude Battery, and so forth
  • Data collected from or about one universe of respondents reconstructed with a second universe as the unit of observation, such as variables on the NLSY79 Child File

The type of variable affects the title or variable description naming each variable, the physical placement of each variable within the codebook, and the location of a variable within a given area of interest.  View Types of Variables.

Sample Weights and Clustering Adjustments

In each survey year a set of sampling weights is constructed. These weights provide the researcher with an estimate of how many individuals in the United States each respondent's answers represent. Weighting decisions for the NLSY79 are guided by the following principles:

  • individual case weights are assigned for each year in such a way as to produce group population estimates when used in tabulations
  • the assignment of individual respondent weights involves at least three types of adjustment, with additional considerations necessary for weighting of NLSY79 Child data

The interested user should consult the NLSY79 Technical Sampling Report (Frankel, Williams, and Spencer 1983) for a step-by-step description of the adjustment process. If users need longitudinal weights for multiple survey years or for a specific set of respondent ids, they can create custom weights by going to the NLSY79 Custom Weighting page.  View Sample Weights and Clustering Adjustments.
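As a rough illustration of how case weights turn respondent-level answers into group population estimates, the following minimal Python sketch computes a weighted total and a weighted share. The records, field values, and weight magnitudes are hypothetical and are not taken from any NLSY79 file; check the weight documentation for scaling conventions before interpreting totals from real data.

```python
def weighted_total(records):
    """Sum of weights: the estimated number of people the records represent."""
    return sum(weight for _, weight in records)


def weighted_share(records, predicate):
    """Weighted proportion of the represented population satisfying predicate."""
    total = weighted_total(records)
    if total == 0:
        raise ValueError("weights sum to zero")
    matching = sum(weight for value, weight in records if predicate(value))
    return matching / total


# Hypothetical (answer, sampling weight) pairs for one survey year.
records = [
    ("employed", 1500.0),
    ("not employed", 500.0),
    ("employed", 2000.0),
]

estimated_population = weighted_total(records)
employed_share = weighted_share(records, lambda answer: answer == "employed")
```

An unweighted tabulation of the same three records would give an employed share of two thirds; the weighted share differs because each respondent stands in for a different number of people.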

Standard Errors and Design Effects

This section contains information on standard errors and design effects for the NLSY79 sample, briefly discussing how to use these two statistical factors. It then includes tables for the first round and for 1996 through the most recent survey. Users interested in the intervening years should review the NLSY79 Technical Sampling Report and Technical Sampling Report Addendum.

Standard errors have been explicitly computed for a number of statistics based upon the entire NLSY79 sample (total, civilian, and military) and a number of sex or race subclasses. Standard errors for other statistics (defined over the entire sample or the subclasses) may be approximated with use of the DEFT factors given in the linked tables.  View Standard Errors and Design Effects.
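The approximation mentioned above can be sketched as follows, assuming the usual definition of DEFT as the square root of the design effect: the standard error under simple random sampling is multiplied by DEFT to approximate the standard error under the actual sample design. The DEFT value, proportion, and sample size below are hypothetical placeholders, not figures from the linked tables.

```python
import math


def srs_se_proportion(p, n):
    """Standard error of a proportion under simple random sampling."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


def design_adjusted_se(srs_se, deft):
    """Approximate design-based standard error: DEFT times the SRS standard error."""
    return deft * srs_se


# Hypothetical inputs: an estimated proportion, a subclass size, and a DEFT factor.
p_hat = 0.5
n = 10_000
deft = 1.5

se_srs = srs_se_proportion(p_hat, n)
se_adjusted = design_adjusted_se(se_srs, deft)
```

In practice the DEFT factor would be read from the table row that matches the statistic and subclass being analyzed.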

Interview Remarks

Each NLSY79 questionnaire includes an interviewer remarks section that interviewers complete after finishing the interview with the respondent. Some of the information is objective (the presence of another person during an in-person survey, for instance) while other information is subjective on the part of the interviewer (such as rating how cooperative the respondent was). View Interview Remarks.

Item Nonresponse

This section examines and quantifies the extent of missing data, formally called item nonresponse, in the NLSY79. All missing data are clearly flagged in the NLSY79 data set. Five negative numbers indicate to users that a variable does not contain useful information: (-1) refusal, (-2) don't know, (-3) invalid skip, (-4) valid skip, and (-5) noninterview. These five numbers are reserved as missing value flags and, apart from a few exceptions (see Appendix 5), are not used in the NLSY79 for valid data values.  View Item Nonresponse.
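A minimal Python sketch of how the five flags might be handled before analysis is shown below. The code-to-label mapping comes from the text above; the function names and the sample values are hypothetical. Because a few variables use negative numbers as valid values (see Appendix 5), a blanket recode like this one should not be applied to those variables.

```python
# Missing-value flags as listed in the text above.
MISSING_CODES = {
    -1: "refusal",
    -2: "don't know",
    -3: "invalid skip",
    -4: "valid skip",
    -5: "noninterview",
}


def recode_missing(value):
    """Return None for a reserved missing-value flag, else the value unchanged."""
    return None if value in MISSING_CODES else value


def missing_summary(values):
    """Count how often each missing-value flag appears in a list of raw values."""
    counts = {label: 0 for label in MISSING_CODES.values()}
    for value in values:
        if value in MISSING_CODES:
            counts[MISSING_CODES[value]] += 1
    return counts


# Hypothetical raw values for one variable (not real NLSY79 data).
raw = [12, -4, 30, -1, -5, 18, -4]

clean = [recode_missing(v) for v in raw]
summary = missing_summary(raw)
```

Keeping the per-flag counts alongside the recoded values preserves the distinction between, for example, a refusal and a valid skip, which usually matters when assessing item nonresponse.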

NLS Investigator

NLSY79 Documentation

Two codebook supplements accompany the data. The first, the NLSY79 Codebook Supplement, contains a series of attachments and appendices, variable creation procedures, supplementary coding categories, and derivations for selected variables on the main NLSY79 data files. Information provided within this document is not available in the NLSY79 codebooks or in the documentation files included with the NLSY79 data sets. The other supplement contains comparable information specific to the NLSY79 Geocode data files. The Technical Sampling Report describes the selection of the NLSY79 sample and provides additional statistical information. Finally, the School & Transcript Surveys Documentation provides technical information about those special data collections.  Errata can also be accessed by following links for the cohort of interest.  View NLSY79 Documentation.


Last Modified Date: April 24, 2020