Occupational Requirements Survey Pre-Production Estimation and Validation Report
September 10, 2015
In fiscal year 2015, the Bureau of Labor Statistics (BLS) completed data collection for the Occupational Requirements Survey (ORS) pre-production test. The pre-production test might better be described as a “dress rehearsal,” as the collection procedures, data capture systems, and review were structured to be as close as possible to those that will be used in production.1 The feasibility tests in FY 2014 and earlier were intended to gauge the viability of collecting occupational data elements and to test modes of collection and procedures. BLS integrated the results of this prior work into the large-scale, nationally representative pre-production test.
This report is a companion piece to the “Occupational Requirements Survey Pre-Production Collection Report,” posted to the BLS website in June 2015.2 The purpose of the earlier report was to provide information on the collection and review of the pre-production data, while the intent of this report is to discuss estimation, validation, and plans for dissemination of the pre-production data. BLS issued the earlier report soon after the close of data collection and review. As some data may have changed during the estimation and validation processes, the numbers provided in this report may differ from those in the earlier report.
Initial work on estimation and validation of the data indicates that BLS is able to successfully produce estimates of occupational requirements that will meet the needs of the Social Security Administration (SSA) as an input into its Occupational Information System (OIS). The OIS will replace the Dictionary of Occupational Titles in SSA’s disability adjudication process.3 Though the pre-production sample was relatively small, over 20,000 estimates were calculated and validated. These estimates span 21 2-digit SOCs and 48 8-digit SOCs and provide information about the requirements of civilian jobs. We continue to refine collection procedures, definitions, and training to ensure that collection of the ORS data elements will lead to high quality estimates as we transition into full-scale production of the Occupational Requirements Survey.
Background and Pre-production Test Overview
In the summer of 2012, the SSA and BLS signed an interagency agreement, which has been updated annually, to begin the process of testing the collection of data on occupational requirements. As a result, BLS established ORS as a test survey in late 2012. The goal of ORS is to collect and publish occupational information that meets the needs of SSA at the level of the eight-digit standard occupational classification (SOC) that is used by the Occupational Information Network (O*NET).4
The ORS data are collected under the umbrella of the National Compensation Survey (NCS), which uses Field Economists (FEs) to collect data. FEs generally collect data elements through either a personal visit to the establishment or remotely via telephone, email, mail, or a combination of modes.
For ORS, FEs are collecting occupationally-specific data elements to meet SSA’s needs in the following categories:
- Physical demands
- Specific vocational preparation (SVP)
- Mental and cognitive demands
- Environmental conditions in which the work is performed.
There were roughly 70 ORS-specific data elements in all, with the majority of these falling in the category of physical demands.
ORS pre-production data collection began in October 2014 and continued until May 2015. At the close of the data review process, information on 7,109 quotes or jobs had been collected from 1,851 establishments, slightly fewer than 4 jobs per establishment. These jobs spanned all 22 unique 2-digit SOCs in scope for ORS and 704 unique 8-digit SOCs.5 The 704 8-digit SOCs represent 63.4 percent of the 1,110 unique 8-digit SOCs.
This report summarizes the process for calculating and validating the estimates from the pre-production test and briefly addresses plans for disseminating these estimates.6 This report does not contain actual estimates based on the ORS pre-production data. These are expected to be released in early FY 2016 in a BLS-published article.
Of the 2,549 establishments contacted by field economists, 168 were either out of business, out of scope, or had no jobs in scope for ORS. Of the remaining 2,381 establishments, 1,851 of them provided usable data, indicating a usable establishment response rate of 78 percent.7
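The response rate arithmetic above can be reproduced with a short sketch. The function name is hypothetical, and the count of 530 refusals is an inference (the 2,381 eligible establishments minus the 1,851 usable ones), not a figure stated in this report.

```python
def usable_response_rate(usable: int, refusals: int) -> float:
    """Usable establishments divided by usable establishments plus refusals."""
    if usable + refusals == 0:
        raise ValueError("no eligible establishments")
    return usable / (usable + refusals)


contacted = 2549
ineligible = 168   # out of business, out of scope, or no in-scope jobs
usable = 1851

eligible = contacted - ineligible   # 2,381 establishments remain
# Assumption: every eligible establishment without usable data is a refusal.
refusals = eligible - usable        # 530

rate = usable_response_rate(usable, refusals)
print(f"{rate:.1%}")  # 77.7%, which rounds to the reported 78 percent
```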
The quote-level response rate was 85 percent, with 15 percent refusals. Approximately 2 percent of the quotes reported no jobs within scope and are excluded from the response rates. Response rates on individual elements varied substantially. Among physical demand elements, the response rate was 84 percent; the response rate was 85 percent for mental and cognitive demands, 86 percent for specific vocational preparation, and 92 percent for environmental conditions.
While there were roughly 70 ORS-specific elements collected in pre-production, there are many more estimates that correspond to these elements.8 For categorical elements, the percentage of workers associated with each category is estimated. For example, the responses for encountering noise are quiet, moderate, loud, and very loud, and the percentage of workers will be calculated for each of these. For continuous variables, the mean, mode, and percentiles (10th, 25th, 50th, 75th, and 90th) are being estimated. An example of this would be time spent standing (measured as a duration in hours). Additionally, some ORS estimates are calculated from multiple ORS elements. SVP, for example, is calculated based on education, experience, certification/licensing, and post-employment training.
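To make the two kinds of estimates concrete, the following is a minimal sketch of weighted category percentages and weighted percentiles. It is not the BLS estimation method: the function names, the sample values, the weights, and the percentile definition (smallest value whose cumulative weight share reaches the target) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def weighted_shares(values, weights):
    """Percentage of total weight falling in each category."""
    totals = defaultdict(float)
    for value, weight in zip(values, weights):
        totals[value] += weight
    grand_total = sum(totals.values())
    return {cat: 100.0 * t / grand_total for cat, t in totals.items()}


def weighted_mean(values, weights):
    """Weight-averaged value of a continuous element."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative weight share reaches pct percent."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total * pct / 100.0:
            return value
    return pairs[-1][0]


# Hypothetical quotes: each has a noise level, hours standing, and a weight.
noise = ["quiet", "moderate", "moderate", "loud"]
standing_hours = [1.0, 2.0, 4.0, 6.0]
weights = [10, 20, 30, 40]

shares = weighted_shares(noise, weights)
median_standing = weighted_percentile(standing_hours, weights, 50)
mean_standing = weighted_mean(standing_hours, weights)
```

With these made-up inputs, half of the weighted workers fall in the moderate noise category and the weighted median time spent standing is 4 hours.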
If we wished to produce only one set of estimates for the roughly 70 ORS-specific elements, for example, national estimates for civilian workers, then 690 estimates would be produced.9 However, the goal is to produce not just “top line” estimates, but estimates on more detailed subgroups, including by 2-digit and 8-digit O*NET-SOC. These series of estimates are:
- Civilian, all workers
- Civilian, 2-digit SOC groups (22)
- Civilian, 8-digit SOC (1,090)
Producing estimates at this level of detail results in potential estimates in the millions, as reflected in the table below.
Table 1: Potential estimates
|Series||Potential Series||Potential Estimates|
In order to produce any estimates, however, there are a number of steps that BLS uses. Sample weights are assigned to each establishment and occupation and then adjusted for non-response. The weights are then benchmarked to account for current employment. Then estimates are produced and reviewed for reliability and confidentiality before they are published.
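The weighting steps described above can be illustrated with a simplified sketch. Everything here is an assumption made for illustration: the adjustment cells, the field names, the employment totals, and the simple ratio adjustments are not taken from BLS procedures, which this report does not spell out.

```python
def adjust_for_nonresponse(units):
    """Within each cell, shift the weight of non-respondents onto respondents."""
    cell_all, cell_resp = {}, {}
    for u in units:
        cell_all[u["cell"]] = cell_all.get(u["cell"], 0.0) + u["weight"]
        if u["responded"]:
            cell_resp[u["cell"]] = cell_resp.get(u["cell"], 0.0) + u["weight"]
    adjusted = []
    for u in units:
        if u["responded"]:
            factor = cell_all[u["cell"]] / cell_resp[u["cell"]]
            adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted


def benchmark(units, employment_totals):
    """Scale weights so weighted employment in each cell matches a target total."""
    cell_emp = {}
    for u in units:
        cell_emp[u["cell"]] = cell_emp.get(u["cell"], 0.0) + u["weight"] * u["employment"]
    return [
        {**u, "weight": u["weight"] * employment_totals[u["cell"]] / cell_emp[u["cell"]]}
        for u in units
    ]


# Hypothetical sampled units; cells "x" and "y" stand in for adjustment groups.
sample = [
    {"id": "A", "cell": "x", "weight": 10.0, "employment": 5, "responded": True},
    {"id": "B", "cell": "x", "weight": 10.0, "employment": 5, "responded": False},
    {"id": "C", "cell": "y", "weight": 20.0, "employment": 2, "responded": True},
]

respondents = adjust_for_nonresponse(sample)
final = benchmark(respondents, {"x": 150.0, "y": 30.0})  # assumed current employment
final_weights = {u["id"]: u["weight"] for u in final}
```

In this toy example, unit A absorbs the weight of non-respondent B, and benchmarking then rescales each cell so that weighted employment equals the assumed external total.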
Table 2 builds on Table 1 to give a sense of which estimates have passed these criteria from the pre-production test.
Table 2: Potential estimates and pre-production estimates
|Series||Potential Series||Potential Estimates||Series with no collected data in pre-production||Series available for calculation||Number of estimates passing initial criteria|
Three-quarters of a million estimates would be possible with full coverage of 8-digit SOCs. The pre-production test, with 7,109 unique quotes collected from 1,851 establishments, had a relatively small sample, so it is not surprising that the actual number of estimates passing BLS’s criteria is considerably smaller than the potential number of estimates.
Turning first to the 2-digit SOC series, column 4 of Table 2 shows that there were no series without collected data in pre-production. This means at least one quote was obtained in each of the 22 2-digit SOC groups. However, column 5, “series available for calculation,” shows that only 21 of the 22 SOCs were used to generate estimates. While data were obtained for all 2-digit SOCs, there were not enough quotes obtained in one of the SOCs (farming, fishing, and forestry occupations) to produce even one estimate for this group. In theory, 15,180 estimates (690 for each of the 22 groups) could have been produced for the 2-digit SOC groups; in actuality, 7,844 estimates passed all BLS criteria. The failure to meet BLS criteria is often attributable to too few observations available for a particular estimate, which may be due to too few quotes in a particular SOC group, how the data are clustered within a SOC, or item-level non-response to particular data elements. For example, although there are five categories of responses for encountering noise, it may be the case that workers in a given occupation are never exposed to noise at all five levels.
There were 387 8-digit SOCs with no inputs from pre-production. Among the 704 SOCs for which there was at least one quote collected during pre-production, only 48 met the initial BLS estimate criteria. These 48, however, include occupations that we anticipate will be of high interest to potential ORS users, including:
- Heavy and tractor-trailer truck drivers
- Food prep workers
- Janitors and cleaners
- Laborers and freight, stock, and material movers
- Nursing assistants
- Waiters and waitresses
As with the 2-digit SOCs, it is important to note that we do not have a full set of 690 estimates for each of the 48 8-digit SOCs. If we had 690 estimates for each of the 48, there would be 33,120 estimates. Table 2 indicates that pre-production data collection generated 12,284 estimates. The discrepancy between the potential and actual number of estimates, which is often due to item non-response (where we have some, but not all, elements for a quote), can commonly be addressed through imputation. The estimates for ORS pre-production are not based on imputation, as we have not finalized an imputation approach. Research in this area is on-going.
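The gap between potential and actual estimates for the 8-digit SOCs can be checked with a few lines of arithmetic. The constant and function names below are hypothetical; the figures are the ones quoted in this report.

```python
ESTIMATES_PER_SERIES = 690  # one full set of estimates for a single series


def potential_estimates(n_series: int) -> int:
    """Number of estimates if every series had a full set."""
    return n_series * ESTIMATES_PER_SERIES


potential_8digit = potential_estimates(48)  # 33,120
produced_8digit = 12_284                    # estimates passing initial criteria

share_produced = produced_8digit / potential_8digit
print(potential_8digit, f"{share_produced:.1%}")  # 33120 37.1%
```

Roughly 37 percent of the potential estimates for these 48 occupations passed the initial criteria without imputation.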
Once estimates are produced, they go through a validation process, by which they are deemed “fit for use.” ORS microdata go through multiple review processes to ensure data entered into the system are correctly coded and any apparent inconsistencies have been documented.10
There are multiple approaches for validating ORS estimates. Visualization tools play a large part in the validation process. First, estimates are reviewed to see whether they conform to expectations. In other BLS programs, such as the National Compensation Survey, expectations may be formed based on past values of the estimates; however, as ORS is a new program, this is not an option for pre-production validation. For pre-production, staff involved in validation examined estimates, using visualization tools and other reports, for outliers and apparent inconsistencies. For example, one should find that landscapers or dishwashers have higher estimates for exposure to wetness than occupations that more typically work indoors and away from sinks. SVP (the amount of time required to learn the techniques, acquire the information, and develop the facility needed for average job performance) should be relatively higher for occupations typically classified as “white collar” versus “blue collar.”
Other approaches to validation involve comparing ORS pre-production estimates to other data sources with similar elements. For example, ORS estimates of SVP can be compared to SVP estimates from O*NET and the Dictionary of Occupational Titles (DOT). Also, BLS collects information on physical environment for the NCS, which can be compared to the ORS elements for physical demands.
Where apparent inconsistencies are identified in validation, decisions have to be made whether to suppress the estimates. This process is on-going. If suppressions are made, then the number of estimates issued by BLS will be smaller than the number indicated in Table 2. Thus, the last column of Table 2 should be seen as an upper bound on the number of estimates from pre-production data.
How will the ORS pre-production estimates be shared with the public? The current plan is to produce an article that highlights a subset of the pre-production estimates, which will be posted on the BLS ORS website before the end of the 2015 calendar year. Once ORS estimates are produced as part of official production, we envision disseminating results through channels that showcase the relationships between data elements. We anticipate that stakeholders will have multiple uses for the estimates and will want to see them presented in different ways. For example, one may want to understand the physical requirements associated with specific occupations. Alternatively, one may be interested in the set of occupations associated with a certain level of SVP. NCS hopes to develop dissemination tools that allow user interaction through dashboards and other technology. The overall goal is to display data visually in addition to providing access through the public database tools. These represent longer-term plans for dissemination.
With ORS pre-production activities nearly complete, it appears that the collection, estimation, and validation processes undertaken during the testing period will lead to estimates that meet the needs of SSA. Even with a relatively small sample of 2,549 establishments, BLS was able to produce and validate over 7,000 estimates at the 2-digit SOC level (spanning 21 2-digit SOCs) and over 12,000 estimates across 48 8-digit SOCs.
As we move into full-scale collection of ORS data, refinements to training and procedures will continue in order to ensure data quality with a target of publishing a full complement of estimates on occupational requirements at the 8-digit O*NET SOC level.
Appendix A: List of ORS Elements
|Specific Vocational Preparation -- 4 elements||Physical Demands - Exertion -- 14 elements|
|Minimum Formal Education or Literacy required||Most weight lifted/Carried ever|
|Pre-employment Training (license, certification, other)||Push/Pull with Feet Only: One or Both|
|Prior Work Experience||Push/Pull with Foot/Leg: One or Both|
|Post-employment training||Push/Pull with Hand/Arm: One or Both|
|Pushing/Pulling with Feet Only|
|Mental and Cognitive Demands Elements -- 9 elements||Pushing/Pulling with Foot/Leg|
|Closeness of Job Control level||Pushing/Pulling with Hand/Arm|
|Complexity of Task level||Sitting|
|Frequency of Deviations from Normal Work Location||Sitting vs Standing at Will|
|Frequency of Deviations from Normal Work Schedule||Standing and Walking|
|Frequency of Deviations from Normal Work Tasks||Weight Lifted/Carried 2/3 of the time or more (range)|
|Frequency of verbal work related interaction with Other Contacts||Weight Lifted/Carried 1/3 up to 2/3 of the time (range)|
|Frequency of verbal work related interaction with Regular Contacts||Weight Lifted/Carried from 2% up to 1/3 of the time (range)|
|Type of work related interactions with Other Contacts||Weight Lifted/Carried up to 2% of the time (range)|
|Type of work related interactions with Regular Contacts|
|Physical Demands - Reaching/Manipulation -- 14 elements|
|Auditory/Vision -- 10 elements||Overhead Reaching|
|Driving, Type of vehicle||Overhead Reaching: One or Both|
|Communicating Verbally||At/Below Shoulder Reaching|
|Hearing: One on one||At/Below Shoulder Reaching: One or Both|
|Hearing: Group||Fine Manipulation|
|Hearing: Telephone||Fine Manipulation: One Hand or Both|
|Hearing: Other Sounds||Gross Manipulation|
|Passage of Hearing Test||Gross Manipulation: One Hand or Both|
|Far Visual Acuity||Foot/Leg Controls|
|Near Visual Acuity||Foot/Leg Controls: One or Both|
|Peripheral Vision||Keyboarding: 10-key|
|Environmental Conditions -- 11 elements||Keyboarding: Touch Screen|
|Extreme Cold||Keyboarding: Traditional|
|Fumes, Noxious Odors, Dusts, Gases||Physical Demands - Postural -- 7 elements|
|Heavy Vibration||Climbing Ladders/Ropes/Scaffolds|
|High, Exposed Places||Climbing Ramps/Stairs: structural only|
|Humidity||Climbing Ramps/Stairs: work-related|
|Noise Intensity Level||Crawling|
|Proximity to Moving Mechanical Parts||Kneeling|
|Toxic, Caustic Chemicals||Stooping|
1 The sample design was similar to that which will be used in production, but altered to meet test goals.
3 Please see "http://www.ssa.gov/disabilityresearch/occupational_info_systems.html" for additional information on SSA’s Occupational Information System.
4 The occupational classification system most typically used by BLS is the six-digit SOC ("https://www.bls.gov/soc/"), generally referred to as “detailed occupations”. O*NET uses a more detailed occupational taxonomy ("https://www.onetcenter.org/taxonomy.html"), classifying occupations at eight-digits and referring to these as “O*NET-SOC 2010 occupations”. There are 840 six-digit SOCs and 1,110 eight-digit SOCs.
5 Though there are 23 2-digit SOCs, one of these, military-specific occupations (55-0000), is out of scope for the NCS and ORS.
6 For more information on sampling and data collection and review, please refer to www.bls.gov/ors/research/research-collection.htm
7 The response rate is calculated as the ratio of the number of usable establishments to the sum of usable establishments and refusals.
8 A table of ORS elements is presented in Appendix A at the end of this document.
9 Civilian workers include private industry and state and local government workers.
10 A more complete description of the ORS review process is presented in the ORS Preproduction Collection Report.
Last Modified Date: January 31, 2017