Survey of Occupational Injuries and Illnesses: Design

A two-stage process is used to select a sample from which estimates are generated for the Survey of Occupational Injuries and Illnesses (SOII). The first stage is the selection of establishments (i.e., sample units) from a frame that is compiled from multiple sources, primarily the Quarterly Census of Employment and Wages (QCEW). The frame includes all in-scope establishments, and the establishments selected from it are required to participate in SOII. The units are selected to create a stratified sample that takes into account industry, ownership, and establishment size. The second stage is the selection, within the establishments that have been selected, of sample cases involving days away from work and sample cases involving job transfer or work restriction. All cases involving days away from work are collected from most establishments. However, to reduce respondent burden, establishments that have a large number of cases involving days away from work are instructed to report a subsample of their cases that occurred in specified time periods. Cases involving job transfer or work restriction are collected from establishments in select industries.

Because SOII is a federal–state cooperative program and the data are designed to meet the needs of the states,1 an independent sample is selected for each participating state or U.S. territory.2 The sample is selected to represent all in-scope private industries, state government, and local government. The sample size for SOII depends on the:

  •    number and kind of cases for which estimates are needed
  •    industries for which estimates are desired
  •    characteristics of the population being sampled
  •    target reliability of the estimates
  •    survey design employed.

One criterion of the SOII design is identifying target estimation industries (TEIs). TEIs, which are selected by each state, are North American Industry Classification System (NAICS) industries or groups of industries for which a state wishes to produce an estimate. For example, a state may choose to target estimates for hospitals (NAICS 622). This TEI would include establishments in general medical and surgical hospitals (NAICS 622110), psychiatric and substance abuse hospitals (NAICS 622210), and specialty hospitals, except psychiatric and substance abuse (NAICS 622310). A sampling cell is the combination of state, ownership, TEI, and size class for which an estimate will be tabulated. Size classes are based on an establishment’s average annual employment, as defined below:

  •    Size class 1 = establishments with 1–10 employees
  •    Size class 2 = establishments with 11–49 employees
  •    Size class 3 = establishments with 50–249 employees
  •    Size class 4 = establishments with 250–999 employees
  •    Size class 5 = establishments with 1,000 or more employees
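The size class boundaries above, and the way a sampling cell combines state, ownership, TEI, and size class, can be sketched in a few lines of Python. This is only an illustration: the function names `size_class` and `sampling_cell`, the example values, and the choice to reject employment below 1 are assumptions, not part of the SOII specification.

```python
from typing import Tuple


def size_class(avg_annual_employment: int) -> int:
    """Return the size class (1-5) for an establishment's average annual employment."""
    if avg_annual_employment < 1:
        # Assumption: the listed classes start at 1 employee, so lower values are rejected.
        raise ValueError("average annual employment must be at least 1")
    # Upper employment bounds of size classes 1 through 4; anything larger is class 5.
    upper_bounds = [10, 49, 249, 999]
    for cls, bound in enumerate(upper_bounds, start=1):
        if avg_annual_employment <= bound:
            return cls
    return 5


def sampling_cell(state: str, ownership: str, tei: str,
                  avg_annual_employment: int) -> Tuple[str, str, str, int]:
    """Build a hypothetical sampling-cell key: (state, ownership, TEI, size class)."""
    return (state, ownership, tei, size_class(avg_annual_employment))


# Illustrative values only: a privately owned hospital with 120 employees.
print(sampling_cell("TX", "private", "622", 120))  # ('TX', 'private', '622', 3)
```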

In SOII, the variability of the incidence rate for total recordable cases (TRC) of injuries and illnesses is used as the primary variable for determining allocation of the sample, because there is a high correlation between these cases and other important characteristics of the data being estimated. Historical TRC rates by state are used to calculate the variance. The optimal allocation procedure distributes the sample to the industries in a manner intended to minimize the variance of the total number of recordable cases in the universe or, alternatively, the incidence rate of recordable cases in the universe. In strata with higher variability of the data, a larger sample is selected. For some sampling cells, it is necessary to select all frame units in the cell in order to meet minimum sampling requirements or to ensure that an adequate number of units are sampled to produce accurate and reliable estimates for the cell.
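To illustrate how a variance-based allocation gives more sample to more variable strata, the sketch below uses a textbook Neyman allocation, in which each cell's share of the sample is proportional to its frame size times the standard deviation of the allocation variable. This is an assumption made for illustration: the exact procedure, constraints, and minimums used for SOII are not described here, and the function name `neyman_allocation`, the `min_per_cell` parameter, and all input values are hypothetical.

```python
from typing import Dict, Tuple


def neyman_allocation(total_sample: int,
                      strata: Dict[str, Tuple[int, float]],
                      min_per_cell: int = 1) -> Dict[str, int]:
    """Allocate a sample across cells in proportion to N_h * S_h.

    strata maps a cell label to (frame units N_h, standard deviation S_h).
    """
    products = {cell: n_frame * sd for cell, (n_frame, sd) in strata.items()}
    total = sum(products.values())
    if total <= 0:
        raise ValueError("at least one cell needs a positive N_h * S_h")

    allocation = {}
    for cell, (n_frame, _sd) in strata.items():
        n = round(total_sample * products[cell] / total)
        n = max(n, min_per_cell)          # hypothetical minimum sample per cell
        allocation[cell] = min(n, n_frame)  # capped at the frame size: a take-all cell
    return allocation


# Two cells of equal size; the second is three times as variable.
cells = {"size class 3": (100, 2.0), "size class 4": (100, 6.0)}
print(neyman_allocation(40, cells))  # {'size class 3': 10, 'size class 4': 30}
```

Because of rounding, the minimum, and the frame-size cap, the allocated counts in this sketch need not sum exactly to `total_sample`. A cell whose allocation reaches its frame size corresponds to the case described above in which all frame units in the cell are selected.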

Once sampling is complete and all necessary reviews and adjustments have been made, sampling weights are calculated for units selected in each sampling cell. A maximum weight threshold is applied to sample units. Sampling weights are calculated by dividing the number of frame units in the sampling cell by the number of sample units in that cell as follows:

$$\text{Sample weight} = \frac{N_U}{n_S}$$

where

$N_U$ = the number of frame units available for selection in the sampling cell
$n_S$ = the number of units sampled.

For example, if there are 100 frame units in a sampling cell from which 5 units are selected for the sample, then the weight assigned to each of the sample units would be 100 divided by 5, or 20.
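The weight calculation is simple enough to check directly. The following minimal sketch reproduces the example above; the function name `sampling_weight` and its input checks are assumptions, and the maximum weight threshold is not modeled because neither its value nor how it is applied is given here.

```python
def sampling_weight(frame_units: int, sample_units: int) -> float:
    """Return the sampling weight for a cell: frame units divided by sampled units."""
    if sample_units <= 0:
        raise ValueError("at least one unit must be sampled")
    if sample_units > frame_units:
        raise ValueError("cannot sample more units than the frame contains")
    return frame_units / sample_units


# 100 frame units, 5 sampled: each sampled unit represents 20 frame units.
print(sampling_weight(100, 5))  # 20.0
```

When every frame unit in a cell is selected, the same calculation gives a weight of 1, so each sampled unit represents only itself.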


1 Contact information for SOII state partners is available at

2 Data for nonparticipating states are collected by BLS regional staff. Data for these states are used in tabulation of national estimates; however, state-level estimates are not available separately for nonparticipating states.

Last Modified Date: July 16, 2018