Handbook of Methods Occupational Employment and Wage Statistics Design

Occupational Employment and Wage Statistics: Design

The Occupational Employment and Wage Statistics (OEWS) survey is based on a probability sample drawn from a universe of about 8.7 million in-scope business establishments stratified by geography, industry, size, and ownership. The sample is designed to represent all nonfarm establishments in the United States.

The full OEWS survey sample is collected over a 6-panel (or 3-year) cycle to provide adequate geographic, industry, and occupational coverage. Each year, the OEWS program collects 2 samples of survey data, each consisting of approximately 186,000 to 189,000 establishments. These semiannual samples are referred to as “panels.” Data are collected semiannually to help reduce seasonal effects. Respondents are asked to report data for the payroll period that includes May 12 or November 12, depending on the panel in which they are sampled.

Over the course of a full 6-panel cycle, approximately 1.1 million establishments are sampled. For example, data collected in May 2024 were combined with data collected in November 2023, May 2023, November 2022, May 2022, and November 2021 to produce the May 2024 OEWS estimates, for a total sample size of approximately 1.1 million units.
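The 6-panel window can be enumerated mechanically. The following Python sketch steps back one semiannual panel at a time from a reference panel; the function name six_panel_cycle is hypothetical and is used here only for illustration.

```python
def six_panel_cycle(year, month):
    """Return the six (month, year) panels combined for a reference panel,
    newest first. Panels alternate between May and November."""
    if month not in ("May", "November"):
        raise ValueError("OEWS panels are collected in May and November")
    panels = []
    for _ in range(6):
        panels.append((month, year))
        # Step back one semiannual panel.
        if month == "May":
            month, year = "November", year - 1
        else:
            month = "May"
    return panels


for panel_month, panel_year in six_panel_cycle(2024, "May"):
    print(panel_month, panel_year)
```

For the May 2024 reference panel, this lists the same six panels named in the example above, ending with November 2021.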

Of the approximately 1.1 million establishments in the 50 states and the District of Columbia in the May 2024 combined initial sample, approximately 1,063,000 were viable establishments (that is, establishments that are not outside the scope or out of business). Of the viable establishments, approximately 698,000 responded and 365,000 did not, yielding a 65.7-percent response rate. The response rate in terms of weighted sample employment is 65.9 percent.
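The unweighted response rate follows directly from the rounded counts above. The Python sketch below reproduces that arithmetic and also shows one plausible form of an employment-weighted rate; the function names and the tiny weighted example are hypothetical, and the weighted calculation is an assumption about how "weighted sample employment" is formed rather than the published method.

```python
def response_rate(respondents, viable):
    """Unweighted response rate, in percent, among viable establishments."""
    return 100.0 * respondents / viable


def weighted_response_rate(units):
    """Employment-weighted response rate, in percent.

    units: iterable of (weight, employment, responded) tuples, one per
    viable establishment. Assumed form: weight * employment is summed
    over respondents and divided by the same sum over all viable units.
    """
    units = list(units)
    total = sum(w * e for w, e, _ in units)
    responding = sum(w * e for w, e, responded in units if responded)
    return 100.0 * responding / total


# Rounded counts from the May 2024 combined sample.
print(f"{response_rate(698_000, 1_063_000):.1f}")  # prints 65.7

# Hypothetical two-unit example for the weighted version.
print(weighted_response_rate([(2.0, 50, True), (4.0, 25, False)]))  # prints 50.0
```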

A probability sample is taken of private-sector establishments, local government establishments, and state government schools and hospitals. OEWS receives an annual census of employees in the federal executive branch, U.S. Postal Service (USPS), Tennessee Valley Authority (TVA), and state government establishments (excluding state government schools and hospitals). A census of Hawaii’s local government establishments, excluding schools and hospitals, is also conducted each November. For establishments covered by census data, only the most recent census is used in each year’s estimates; units from older panels are deleted to avoid double counting.

Frame construction

The sampling frame, or universe, is a list of about 8.7 million in-scope nonfarm establishments that file unemployment insurance (UI) reports with the state workforce agencies. Employers are required by law to file these reports in the state where each establishment is located. Every quarter, the U.S. Bureau of Labor Statistics (BLS) creates a national sampling frame by combining the administrative lists of unemployment insurance reports from all of the states into a single database called the Quarterly Census of Employment and Wages (QCEW). Every 6 months, OEWS extracts the administrative data of establishments that are in scope for the OEWS survey from the most current QCEW. QCEW files are supplemented with frame files covering establishments in Guam and the rail transportation industry (NAICS 4821) because these are outside the UI program’s scope.

Construction of the sampling frame includes a process in which establishments that are linked together into multiunit companies are assigned to either the May or November sample. This prevents BLS from contacting multiunit companies more than once per year for this survey. Furthermore, the frame is matched to the five prior sample panels, and units that have been previously selected in the five prior panels are marked as ineligible for sampling for the current panel.
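The two eligibility rules described above can be sketched as a simple filter. In the Python example below, the record layout (the "id" and "panel_month" keys), the function name, and the convention that single-unit establishments carry no assigned month are all hypothetical; the source does not describe how the frame is stored or how multiunit companies are assigned to a month.

```python
def eligible_units(frame, prior_panels, current_month):
    """Return frame records eligible for sampling in the current panel.

    frame: list of dicts with keys "id" and "panel_month". A value of
        "May" or "November" marks a multiunit company's assigned month;
        None (assumed here) marks a unit with no assigned month.
    prior_panels: list of sets of establishment ids, one set for each
        of the five prior panels.
    current_month: "May" or "November".
    """
    previously_selected = set().union(*prior_panels)
    eligible = []
    for unit in frame:
        assigned = unit["panel_month"]
        if assigned is not None and assigned != current_month:
            continue  # multiunit company assigned to the other panel
        if unit["id"] in previously_selected:
            continue  # already sampled in one of the five prior panels
        eligible.append(unit)
    return eligible


frame = [
    {"id": 1, "panel_month": "May"},
    {"id": 2, "panel_month": "November"},
    {"id": 3, "panel_month": None},
    {"id": 4, "panel_month": None},
]
prior = [{4}, set(), set(), set(), set()]
print([u["id"] for u in eligible_units(frame, prior, "May")])  # prints [1, 3]
```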

Stratification

Establishments in the sampling frame are stratified by geographic area, industry group, ownership, and size. Stratification is done at the state; metropolitan or nonmetropolitan area; 3-, 4-, 5-, or 6-digit NAICS; and ownership level.

Geography

There are over 580 metropolitan statistical areas (MSAs) and nonmetropolitan or balance-of-state (BOS) areas specified. MSAs are defined and mandated by the Office of Management and Budget. Each officially defined metropolitan area within a state is specified as a substate area. MSAs that cross state borders have a separate portion for each state contributing to that MSA. In addition, states may have up to six residual nonmetropolitan areas that together cover the remaining non-MSA portion of their state.

Industry

There are about 300 industry groups defined at the NAICS 3-, 4-, 5-, or 6-digit level.

Ownership

Schools and hospitals are stratified by state government, local government, or private ownership. Local government casinos and gambling establishments are sampled separately from the rest of local government.

Size

Sampled establishments are separated into two types based on employment size: certainty units and noncertainty units. Large employers are selected into the sample with certainty over the 3-year sample cycle. A probability sample is taken of smaller establishments with employment below the certainty size cutoff.

At any given time, there are about 145,000 nonempty strata on the frame. The set of nonempty strata can differ substantially from one frame to the next. The differences are primarily due to normal establishment birth and death processes and normal establishment growth and shrinkage. Other differences are caused by changes in establishments’ NAICS classifications or geographic locations.

A small number of establishments provide the state in which their employees are located, but do not provide the specific county in which they are located. These establishments are also sampled and used in the calculation of the statewide and national estimates. They are not included in the estimates of any substate area. Therefore, the sum of the employment in the MSAs and nonmetropolitan areas within a state may be less than the statewide employment.

Sample size

The combined 6-panel sample allocations, excluding all federal government census data, for the May 2024 estimates are as follows:

188,236 establishments for May 2024
186,410 establishments for November 2023
188,045 establishments for May 2023
186,064 establishments for November 2022
186,911 establishments for May 2022
187,215 establishments for November 2021

The May 2024 data include a census of about 6,300 federal executive branch, TVA, and USPS units. The combined sample size for the May 2024 estimates is approximately 1.1 million establishments, which includes only the most recent data for federal and state government.
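As a quick check on the "approximately 1.1 million" figure, the six panel allocations listed above can be summed together with the approximate federal census count. The Python sketch below treats the figures as simply additive, which is an assumption; the variable names are illustrative.

```python
# Panel allocations for the May 2024 estimates, as listed above.
panel_allocations = {
    "May 2024": 188_236,
    "November 2023": 186_410,
    "May 2023": 188_045,
    "November 2022": 186_064,
    "May 2022": 186_911,
    "November 2021": 187_215,
}

# Approximate count of federal executive branch, TVA, and USPS units.
federal_census_units = 6_300

sample_total = sum(panel_allocations.values())
combined_total = sample_total + federal_census_units

print(sample_total)    # prints 1122881
print(combined_total)  # prints 1129181
print(round(combined_total / 1_000_000, 1))  # prints 1.1
```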

Allocation methods

The sampling frame is stratified into approximately 145,000 nonempty strata. Each time a sample is selected, a 6-panel allocation of the 1.1 million sample units among these strata is performed.

The largest establishments are removed from the allocation because they will be selected with certainty once during the 6-panel cycle. Once the certainty units have been removed, the probability sample of noncertainty units is allocated across strata. For the remaining nonempty strata, a set of minimum sample size requirements based on the number of establishments in each cell is used to ensure sufficient coverage by industry and geographic area. For each stratum, a sample allocation is also calculated using a power Neyman allocation.¹ The actual 6-panel sample allocation is the larger of the minimum sample allocation and the power Neyman allocation. To determine the current single panel allocation, the 6-panel allocation is divided by 6, and the resulting quotient is randomly rounded.
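The last two steps for a single stratum can be sketched in Python as follows. The source does not define "randomly rounded"; the sketch assumes the common unbiased form, in which the quotient is rounded up with probability equal to its fractional part and rounded down otherwise. All function names are hypothetical.

```python
import math
import random


def six_panel_allocation(minimum, power_neyman):
    """6-panel allocation: the larger of the two candidate allocations."""
    return max(minimum, power_neyman)


def random_round(x, rng):
    """Round x up with probability equal to its fractional part,
    otherwise down (assumed meaning of 'randomly rounded')."""
    lower = math.floor(x)
    fraction = x - lower
    return lower + (1 if rng.random() < fraction else 0)


def single_panel_allocation(minimum, power_neyman, rng):
    """Divide the 6-panel allocation by 6 and randomly round the quotient."""
    return random_round(six_panel_allocation(minimum, power_neyman) / 6, rng)


rng = random.Random(12345)  # seeded only to make the example repeatable
# Minimum of 6 units, power Neyman allocation of 20: 20 / 6 = 3.33...,
# so the single-panel allocation is either 3 or 4.
print(single_panel_allocation(6, 20, rng))
```

Under this assumed rounding rule, the expected single-panel allocation equals the unrounded quotient, so the six panels together average out to the 6-panel allocation.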

Two factors influence the power Neyman allocation. The first is the square root of the employment size of each stratum. With a Neyman allocation, strata with higher levels of employment generally are allocated more sample units than strata with lower levels of employment. Using the square root within the Neyman allocation softens this effect. The second factor is a measure of the occupational variability of the industry based on prior OEWS survey data. The occupational variability of an industry is measured by computing the coefficient of variation (CV) for each occupation (excluding the least common occupations in each industry), averaging those CVs, and then calculating the standard error from that average CV. Using this measure, industries that tend to have greater occupational variability will get more sample units than industries that are more occupationally homogeneous.
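To illustrate how the two factors interact, the Python sketch below allocates a fixed sample in proportion to the product of the square root of stratum employment and a variability measure. The source does not give the exact formula, so the multiplicative form, the function name, and the example strata are assumptions made for illustration only.

```python
def power_neyman_allocation(strata, total_sample, power=0.5):
    """Allocate total_sample across strata in proportion to
    employment ** power * variability.

    strata: dict mapping a stratum label to (employment, variability).
    power: 0.5 gives the square-root softening described above;
        1.0 would weight by employment itself.
    Returns unrounded allocations.
    """
    scores = {
        label: (employment ** power) * variability
        for label, (employment, variability) in strata.items()
    }
    total_score = sum(scores.values())
    return {label: total_sample * s / total_score for label, s in scores.items()}


# Hypothetical strata: (employment, occupational variability measure).
strata = {
    "large, homogeneous": (40_000, 0.5),
    "small, homogeneous": (10_000, 0.5),
    "small, variable": (10_000, 1.0),
}
for label, n in power_neyman_allocation(strata, 400).items():
    print(f"{label}: {n:.1f}")
```

In this example, the large stratum has four times the employment of the small homogeneous stratum but receives only twice the sample, and the more variable small stratum receives as much sample as the large one.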

Sample selection

To provide the most occupational coverage, sample selection within strata is approximately proportional to size. Some of the largest establishments are selected with certainty. Among the noncertainty units, establishments with higher employment are more likely to be selected than those with lower employment. The unweighted employment of sampled establishments made up approximately 55 percent of total employment in May 2024.

Permanent random numbers (PRNs) are used in the sample selection process. To minimize sample overlap between the OEWS survey and other large surveys conducted by BLS, each establishment is assigned a PRN. For each stratum, a specific PRN value is designated as the “starting” point to select a sample. From this starting point, the first n eligible establishments in the frame are selected sequentially into the sample, where n denotes the number of establishments to be sampled.
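The sequential step within one stratum can be sketched in Python as follows. The sketch orders eligible units by PRN, begins at the starting point, and wraps around to the lowest PRNs if the end of the list is reached; the wrap-around, the tuple layout, and the function name are assumptions, and the sketch does not attempt to model the size-proportional aspect of selection.

```python
def prn_select(units, start, n):
    """Select the first n eligible units at or after the starting PRN.

    units: list of (establishment_id, prn, eligible) tuples for one stratum,
        with prn in [0, 1).
    start: the stratum's starting PRN value.
    n: number of establishments to sample.
    """
    eligible = sorted((u for u in units if u[2]), key=lambda u: u[1])
    at_or_after = [u for u in eligible if u[1] >= start]
    before = [u for u in eligible if u[1] < start]
    # Wrap around to the lowest PRNs if fewer than n units follow the start.
    ordered = at_or_after + before
    return [u[0] for u in ordered[:n]]


# Hypothetical stratum; unit "c" is ineligible (for example, it was
# selected in one of the five prior panels).
units = [
    ("a", 0.10, True),
    ("b", 0.35, True),
    ("c", 0.50, False),
    ("d", 0.80, True),
    ("e", 0.95, True),
]
print(prn_select(units, start=0.30, n=2))  # prints ['b', 'd']
print(prn_select(units, start=0.90, n=2))  # prints ['e', 'a']
```

Because each establishment keeps the same PRN over time, surveys that use different starting points in the same stratum tend to pick different establishments, which is how overlap is reduced.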

Sampling weights

Sampling weights are computed so that each panel will roughly represent the entire universe of establishments.

Federal government, USPS, TVA, and state government units are assigned a sample weight of 1. Other sampled establishments are assigned a design-based sample weight, which reflects the inverse of the probability of selection.
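The weighting rule above amounts to a one-line calculation per unit. The Python sketch below is a minimal illustration; the function name and the is_census_unit flag are hypothetical, and the sketch does not cover any later adjustments to the weights.

```python
def sampling_weight(selection_probability, is_census_unit=False):
    """Design-based sampling weight for one establishment.

    Census units (federal government, USPS, TVA, and state government)
    receive a weight of 1. All other sampled units receive the inverse
    of their probability of selection.
    """
    if is_census_unit:
        return 1.0
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


print(sampling_weight(0.25))                      # prints 4.0
print(sampling_weight(1.0, is_census_unit=True))  # prints 1.0
```

An establishment selected with probability 0.25 therefore stands in for four establishments in its stratum.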

Notes

¹ The power Neyman allocation is a statistical method of balancing the efficiency of the overall estimate with the efficiency of subnational estimates. For more information, see Michael D. Bankier, “Power allocations: Determining sample sizes for subnational areas,” The American Statistician, vol. 42, no. 3 (August 1988), pp. 174–177.

Last Modified Date: December 30, 2025