Handbook of Methods Occupational Employment and Wage Statistics Design

Occupational Employment and Wage Statistics: Design

This is an archived page. To see the latest version, please visit Occupational Employment and Wage Statistics: Design.

The Occupational Employment Statistics (OES) survey is based on a probability sample drawn from a universe of about 7.9 million in-scope establishments stratified by geography, industry, size, and ownership. The sample is designed to represent all nonfarm establishments in the United States.

Semiannual samples are referred to as panels. The survey is conducted over a rolling 6-panel (or 3-year) cycle. This is done to provide adequate geographic, industrial, and occupational coverage. Over the course of a 6-panel (or 3-year) cycle, approximately 1.1 million establishments are sampled. To the extent possible, private sector units selected in any one panel are not sampled again in the next five panels. For example, in a cycle, data collected in May 2019 are combined with data collected in November 2018, May 2018, November 2017, May 2017, and November 2016.
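As an illustration of the rolling cycle, the following sketch lists the six semiannual panels that would be combined for a given reference panel. The function name and the (year, month) representation of a panel are assumptions made for this example; they are not part of the OES processing system.

```python
def panels_in_cycle(year, month, n_panels=6):
    """Return the n_panels most recent semiannual panels, newest first.

    A panel is represented (hypothetically) as a (year, month) tuple,
    where month is 5 for May or 11 for November.
    """
    if month not in (5, 11):
        raise ValueError("panels are collected in May (5) or November (11)")
    panels = []
    for _ in range(n_panels):
        panels.append((year, month))
        # Step back one half-year: May -> previous November, November -> May.
        if month == 5:
            year, month = year - 1, 11
        else:
            month = 5
    return panels


# Panels combined for the May 2019 estimates.
may_2019_cycle = panels_in_cycle(2019, 5)
```

For the May 2019 reference panel, the list runs from May 2019 back to November 2016, matching the example in the text.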

A probability sample is taken of local government establishments, private sector establishments, and state schools and hospitals.

Frame construction

The sampling frame, or universe, is a list of about 7.9 million in-scope nonfarm establishments that file unemployment insurance (UI) reports with the state workforce agencies. Employers are required by law to file these reports with the state in which each establishment is located. Every quarter, the U.S. Bureau of Labor Statistics (BLS) creates a national sampling frame by combining the administrative lists of unemployment insurance reports from all of the states into a single database called the Quarterly Census of Employment and Wages (QCEW). Every 6 months, OES extracts the administrative data of establishments that are in scope for the OES survey from the most current QCEW. The QCEW files are supplemented with frame files covering establishments in Guam and the rail transportation industry (NAICS 4821) because these are outside the scope of the UI program.

Construction of the sampling frame includes a process in which establishments that are linked together into multiunit companies are assigned to either the May or the November sample. This prevents BLS from contacting a multiunit company more than once per year for this survey. Furthermore, the frame is matched to the five prior sample panels, and units selected in any of those panels are marked as ineligible for sampling in the current panel.
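The matching step can be pictured with a small sketch. It assumes each prior panel is available as a set of establishment identifiers, ordered oldest to newest; the function name and data layout are hypothetical, and the sketch ignores the "to the extent possible" exceptions mentioned earlier.

```python
def mark_eligibility(frame_ids, prior_panels, n_prior=5):
    """Map each frame establishment ID to True if it is eligible for sampling.

    frame_ids    -- iterable of establishment IDs on the current frame
    prior_panels -- list of sets of IDs sampled in earlier panels,
                    ordered oldest to newest (an assumed layout)
    """
    previously_selected = set()
    # Only the n_prior most recent panels make a unit ineligible.
    for panel in prior_panels[-n_prior:]:
        previously_selected.update(panel)
    return {est_id: est_id not in previously_selected for est_id in frame_ids}


# Hypothetical frame of three establishments and three prior panels.
eligibility = mark_eligibility(["A", "B", "C"], [{"A"}, set(), {"C"}])
```

In this toy frame, only establishment "B" remains eligible, because "A" and "C" each appear in one of the prior panels.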

Stratification

Establishments in the sampling frame are stratified by geographic area, industry group, ownership, and size.

Geography. There are 606 Metropolitan Statistical Areas (MSAs) and nonmetropolitan or Balance-of-State (BOS) areas specified. MSAs are defined and mandated by the Office of Management and Budget. Each officially defined metropolitan area within a state is specified as a substate area. Cross-state MSAs have a separate portion for each state contributing to that MSA. In addition, states may have up to six residual nonmetropolitan areas that together cover the remaining non-MSA portion of their state.

Industry. There are 302 industry groups defined at the NAICS 3-, 4-, 5-, or 6-digit level.

Ownership. Schools are stratified by state government, local government, or private ownership. Also, local government casinos and gambling establishments are sampled separately from the rest of local government.

Size. Establishments are divided into certainty and noncertainty size classes.

At any given time, there are about 150,000 nonempty State/MSA-BOS/NAICS 3-, 4-, 5-, 6-digit/ownership strata on the frame. When comparing nonempty strata between frames, there may be substantial frame-to-frame differences. The differences are due primarily to normal establishment birth and death processes and normal establishment growth and shrinkage. Other differences are due to establishment NAICS reclassification and changes in geographic location.

A small number of establishments indicate the state in which their employees are located, but do not indicate the specific county in which they are located. These establishments are also sampled and used in the calculation of the statewide and national estimates. They are not included in the estimates of any substate area. Therefore, the sum of the employment in the MSAs and nonmetropolitan areas within a state may be less than the statewide employment.

Sample size

The combined sample for the May 2019 survey is the equivalent of six panels. Panel allocations are smaller in the most recent panels than at the start of the cycle. The sample allocations, excluding federal government and U.S. Postal Service (USPS), for the panels in this cycle are:

182,809 establishments for May 2019

186,679 establishments for November 2018

186,125 establishments for May 2018

185,450 establishments for November 2017

195,117 establishments for May 2017

201,952 establishments for November 2016

The May 2019 data include a census of 8,094 federal and USPS units. The combined sample size for the May 2019 estimates is approximately 1.1 million establishments, which includes only the most recent data for federal and state government. Federal and state government units from older panels are deleted to avoid double counting.
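As a rough check on the figure of approximately 1.1 million, the panel allocations listed above can be summed together with the federal and USPS census. The variable names are chosen for this example only. Note that this simple sum does not remove state government units from older panels, so it should be read as an upper bound on the combined sample rather than the exact combined sample size.

```python
# Panel allocations as listed above (excluding federal government and USPS).
panel_allocations = {
    "May 2019": 182_809,
    "November 2018": 186_679,
    "May 2018": 186_125,
    "November 2017": 185_450,
    "May 2017": 195_117,
    "November 2016": 201_952,
}
federal_usps_census = 8_094

six_panel_total = sum(panel_allocations.values())        # 1,138,132
combined_total = six_panel_total + federal_usps_census   # 1,146,226

print(f"Six-panel allocation: {six_panel_total:,}")
print(f"Including federal and USPS census: {combined_total:,}")
```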

Allocation methods

The sampling frame is stratified into approximately 150,000 nonempty State/MSA-BOS/NAICS 3-, 4-, 5-, 6-digit/ownership strata. Each time a sample is selected, a 6-panel allocation of the 1.1 million sample units among these strata is performed. The largest establishments are removed from the allocation because they will be selected with certainty once during the 6-panel cycle. For the remaining noncertainty strata, a set of minimum sample size requirements based on the number of establishments in each cell is used to ensure coverage for industries and MSAs. For each State/MSA-BOS/NAICS 3-, 4-, 5-, 6-digit/ownership stratum, a sample allocation is calculated using a power Neyman allocation.¹ The actual 6-panel sample allocation is the larger of the minimum sample allocation and the power allocation. To determine the current single panel allocation, the 6-panel allocation is divided by 6, and the resulting quotient is randomly rounded.
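The last two steps of this paragraph can be sketched as follows. The source does not define "randomly rounded"; this example assumes the common convention in which the quotient is rounded up with probability equal to its fractional part and rounded down otherwise, so that the expected single-panel allocation equals the 6-panel allocation divided by 6. The function names are hypothetical.

```python
import math
import random


def six_panel_allocation(minimum_alloc, power_alloc):
    """The larger of the minimum allocation and the power allocation."""
    return max(minimum_alloc, power_alloc)


def single_panel_allocation(six_panel_alloc, rng, n_panels=6):
    """Divide the 6-panel allocation by n_panels and randomly round.

    Assumed rounding rule: round up with probability equal to the
    fractional part of the quotient, otherwise round down.
    """
    quotient = six_panel_alloc / n_panels
    base = math.floor(quotient)
    fraction = quotient - base
    return base + (1 if rng.random() < fraction else 0)


rng = random.Random(2019)  # seeded only to make the example repeatable
stratum_alloc = six_panel_allocation(minimum_alloc=4, power_alloc=10)
this_panel = single_panel_allocation(stratum_alloc, rng)  # either 1 or 2
```

With a 6-panel allocation of 10, the quotient is about 1.67, so under the assumed rule the stratum would receive 2 units in roughly two-thirds of panels and 1 unit otherwise.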

Two factors influence the power Neyman allocation. One is the square root of the employment size of each stratum. With a Neyman allocation, strata with higher levels of employment generally are allocated more sample units than strata with lower levels of employment. Using the square root within the Neyman allocation softens this effect. The other factor is a measure of the occupational variability of the industry based on prior OES survey data. The occupational variability of an industry is measured by computing the coefficient of variation (CV) for each occupation within the 90th percentile of occupational employment in a given industry, averaging those CVs, and then calculating the standard error from that average CV. Using this measure, industries that tend to have greater occupational variability are allocated more sample units than industries that are more occupationally homogeneous.
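A minimal sketch of how the two factors might combine is given below. It assumes the allocation for each stratum is proportional to the square root of stratum employment multiplied by the variability measure; the exact form used in OES production is not stated here, and the function name, input layout, and example values are hypothetical.

```python
def power_neyman_allocation(strata, total_sample, power=0.5):
    """Allocate total_sample across strata in proportion to
    employment**power * variability.

    strata -- dict mapping a stratum label to (employment, variability)
    power  -- 0.5 gives the square root of employment described in the text;
              1.0 would weight by employment itself
    """
    scores = {
        label: (employment ** power) * variability
        for label, (employment, variability) in strata.items()
    }
    total_score = sum(scores.values())
    if total_score <= 0:
        raise ValueError("at least one stratum needs a positive score")
    return {label: total_sample * s / total_score for label, s in scores.items()}


# Two hypothetical strata with equal variability; "large" has 4x the employment.
example = power_neyman_allocation(
    {"small": (100, 1.0), "large": (400, 1.0)}, total_sample=30
)
```

In this example the larger stratum receives about twice the sample of the smaller one rather than four times, which is the softening effect of the square root described above.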

Sample selection

Sample selection within strata is approximately proportional to size. To provide the most occupational coverage, establishments with higher employment are more likely to be selected than those with lower employment; some of the largest establishments are selected with certainty. The unweighted employment of sampled establishments makes up approximately 57.2 percent of total employment.

Permanent random numbers (PRNs) are used in the sample selection process. To minimize sample overlap between the OES survey and other large surveys conducted by BLS, each establishment is assigned a PRN. For each stratum, a specific PRN value is designated as the “starting” point to select a sample. From this “starting” point, the first n eligible establishments in the frame are selected sequentially into the sample, where n denotes the number of establishments to be sampled.
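The selection rule can be sketched as follows. The example assumes that establishments in a stratum are ordered by PRN and that selection wraps around to the lowest PRN if the end of the list is reached before n units are found; the wrap-around, the function name, and the tuple layout are assumptions made for illustration.

```python
def select_by_prn(units, start, n):
    """Select the first n eligible units at or after the starting PRN.

    units -- list of (establishment_id, prn, eligible) tuples for one stratum
    start -- the stratum's designated starting PRN value
    n     -- number of establishments to sample
    """
    ordered = sorted(units, key=lambda u: u[1])
    at_or_after = [u for u in ordered if u[1] >= start]
    before = [u for u in ordered if u[1] < start]
    selected = []
    # Walk forward from the starting point, wrapping around (assumed).
    for est_id, _prn, eligible in at_or_after + before:
        if len(selected) == n:
            break
        if eligible:
            selected.append(est_id)
    return selected


# Hypothetical stratum; "B" is ineligible because it was sampled recently.
stratum = [
    ("A", 0.10, True),
    ("B", 0.35, False),
    ("C", 0.50, True),
    ("D", 0.80, True),
    ("E", 0.95, True),
]
sample = select_by_prn(stratum, start=0.30, n=2)
```

Here the walk begins at "B", skips it as ineligible, and selects "C" and "D". Because each establishment keeps the same PRN over time, choosing different starting points for different surveys tends to place them in different parts of the list, which limits overlap.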

Sampling weights

Sampling weights are computed so that each panel will roughly represent the entire universe of establishments.

Federal government, USPS, and state government units are assigned a panel weight of 1. Other sampled establishments are assigned a design-based panel weight, which reflects the inverse of the probability of selection.
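The weighting rule in this paragraph can be illustrated with a short sketch. The ownership labels and the function name are hypothetical, and the selection probability is taken as given rather than derived from the sample design.

```python
# Ownership types treated as a census and assigned a panel weight of 1.
# The label strings are assumptions for this example.
CENSUS_OWNERSHIPS = {"federal", "usps", "state"}


def panel_weight(ownership, selection_probability=None):
    """Return the panel weight for one sampled establishment."""
    if ownership in CENSUS_OWNERSHIPS:
        return 1.0
    if selection_probability is None or not 0 < selection_probability <= 1:
        raise ValueError("selection_probability must be in (0, 1]")
    # Design-based weight: the inverse of the probability of selection.
    return 1.0 / selection_probability


federal_weight = panel_weight("federal")
private_weight = panel_weight("private", selection_probability=0.25)
```

An establishment selected with probability 0.25 receives a weight of 4, meaning that it represents four establishments in the universe for that panel.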

Notes

¹ The power Neyman allocation is a statistical method of balancing the efficiency of the overall estimate with the efficiency of subnational estimates. For more information, see Michael D. Bankier, “Power Allocations: Determining Sample Sizes for Subnational Areas,” The American Statistician, vol. 42, no. 3 (August 1988), pp. 174–177.

Last Modified Date: August 31, 2020