Preproduction Collection Report : U.S. Bureau of Labor Statistics


Occupational Requirements Survey (ORS) Data Review Process

Ruth Meharenna¹

¹ U.S. Bureau of Labor Statistics, 2 Massachusetts Ave., NE, Room 4160, Washington, DC 20212

Abstract

The Occupational Requirements Survey (ORS) is an establishment survey conducted by the Bureau of Labor Statistics (BLS) for the Social Security Administration (SSA). The survey collects information on the vocational preparation and the cognitive and physical requirements of occupations in the U.S. economy, as well as the environmental conditions in which those occupations are performed. This paper provides an overview of the ORS Review Program including information on the review processes, systems, and tools. The review process ensures that data are coded correctly and that documentation is sufficient. Review takes a variety of forms, such as edits in the computer system to catch erroneous data and edits that look for unusual data. One review process targets specific elements that an employer reports, while a second review process looks at all the data an employer provides. This paper discusses how all these review processes work together to ensure the quality and transparency of the data.

Keywords: Review process, outliers, microdata review, data quality, data accuracy, edits

1. Introduction

In the summer of 2012, the Social Security Administration (SSA) and the Bureau of Labor Statistics (BLS) signed an interagency agreement, which has been updated annually, to begin the process of testing the collection of data on occupations. As a result, the Occupational Requirements Survey (ORS) was established as a test survey in late 2012. The goal of ORS is to collect and publish occupational information that will replace the outdated data currently used by SSA. More information on the background of ORS can be found in the next section. All ORS products will be made public for use by non-profits, employment agencies, state or federal agencies, the disability community, and other stakeholders.

An ORS interviewer attempts to collect close to 70 data elements (shown in appendix, table 1) related to the occupational requirements of a job. The following four groups of information will be collected:

Physical demand characteristics/factors of occupations (e.g., strength, hearing, or stooping)
Specific vocational preparation requirements, which include educational requirements, experience, licensing and certification and post-employment training
Mental and cognitive demands of work
Environmental conditions in which the work is completed

Field testing to date has focused on developing procedures, protocols, collection aids, and microdata review processes using the National Compensation Survey (NCS) platform. It was from this field testing that a comprehensive review program was developed. The ORS Review Program encompasses varying review levels and requires active, constructive, and integrated roles on the part of field economists, regional office staff, and national office staff. It specifically includes both a quality assurance component and microdata review component for ensuring data accuracy. This paper presents an overview of the ORS Review Program. Section 2 provides background information on the Occupational Requirements Survey. Section 3 provides information on the ORS review program and review process, edits, tools and systems used by the National Office for the review of the ORS microdata. The paper ends with a summary and a description of additional review processes still being developed.

2. Background Information on ORS

In addition to providing Social Security benefits to retirees and survivors, the Social Security Administration (SSA) administers two large disability programs which provide benefit payments to millions of beneficiaries each year. Determinations for adult disability applicants are based on a five-step process that evaluates the capabilities of the worker, the requirements of their past work, and their ability to perform other work in the U.S. economy. In some cases, if an applicant is denied disability benefits, SSA policy requires adjudicators to document the decision by citing examples of jobs the claimant can still perform despite restrictions (such as limited ability to balance, stand, or carry objects) ^[1].

For over 50 years, the Social Security Administration has turned to the Department of Labor's Dictionary of Occupational Titles (DOT) ^[2] as its primary source of occupational information to process disability claims. SSA has incorporated many DOT conventions into its disability regulations. However, the DOT was last updated in its entirety in the late 1970s, and a partial update was completed in 1991. Consequently, the SSA adjudicators who make the disability decisions must continue to refer to an increasingly outdated resource because it remains the most compatible with their statutory mandate and is the best source of data at this time.

When an applicant is denied SSA benefits, SSA must sometimes document the decision by citing examples of jobs that the claimant can still perform, despite their functional limitations. However, since the DOT has not been updated for so long, there are some jobs in the American economy that are not even represented in the DOT, and other jobs, in fact many often-cited jobs, no longer exist in large numbers in the American economy.

SSA has investigated numerous alternative data sources for the DOT, such as adapting the Employment and Training Administration’s Occupational Information Network (O*NET) ^[3], using the BLS Occupational Employment Statistics program (OES) ^[4], and developing its own survey. None of those potential data sources proved successful, and SSA turned to the National Compensation Survey program at the Bureau of Labor Statistics.

NCS is a national survey of business establishments conducted by the BLS ^[5]. Initial data from each sampled establishment are collected during a one-year sample initiation period. Many collected data elements are then updated each quarter, while other data elements are updated annually for at least three years. The data from the NCS are used to produce the Employment Cost Index (ECI), Employer Costs for Employee Compensation (ECEC), and various estimates about employer-provided benefits. Additionally, data from the NCS are combined with data from the OES to produce statistics that are used to help in the Federal pay-setting process.

In order to produce these measures, the NCS collects information about the sampled business or governmental operation and about the occupations that are selected for detailed study. Each sample unit is classified using the North American Industry Classification System (NAICS) ^[6]. Each job selected for study is classified using the Standard Occupational Classification system (SOC) ^[7]. In addition, each job is classified by work level – from entry level to expert, nonsupervisory employee to manager, etc. ^[8]. These distinctions are made by collecting information on the knowledge required to do the job, the job controls provided, the complexity of the tasks, the contacts made by the workers, and the physical environment where the work is performed. Many of these data elements are very similar to the types of data needed by SSA for the disability determination process.

All NCS data collection is performed by professional economists or statisticians, generically called field economists. Each field economist must have a college diploma and is required to complete a rigorous training and certification program before collecting data independently. As part of this training program, each field economist must complete several training exercises to ensure that collected data are coded the same way no matter which field economist collects the data. NCS uses processes like the field economist training to help ensure that the data collected in all sectors of the economy in all parts of the country are coded uniformly.

SSA asked the NCS to partner with them under an annual interagency reimbursable agreement to test whether the NCS infrastructure could be used to collect data on occupational requirements.

If BLS is able to collect these data about work demands, SSA would have new and better data to use in its disability programs. SSA cited three key advantages of using NCS to provide this updated data:

Reputation - SSA was impressed with the BLS reputation for producing high quality, statistically accurate data that are trusted by our data users and follow statistically accepted methods and principles.
Trained Workforce - SSA was also impressed that NCS Field Economists have experience collecting information about occupations in America’s work force and collecting data similar to that needed by SSA.
Survey Infrastructure - After attempting to develop their own survey, SSA was also impressed with the fact that NCS has infrastructure in place across the country to manage and implement a new survey to meet their data needs as well as systems and processes to support all the steps of the survey.

Since 2012, NCS has been testing our ability to collect these new data elements using the NCS survey infrastructure. Field testing to date has focused on developing procedures, protocols, and collection aids using the NCS infrastructure. These testing phases were analyzed primarily using qualitative techniques but have shown that this survey is operationally feasible.

The pre-production test might better be described as a “dress rehearsal” as the collection procedures, data capture systems, and review processes were structured to be as close as possible to those that will be used in production. The sample design for the pre-production test was similar to that which will be used in production, but was altered to meet test goals. While the feasibility tests in FY 2014 and earlier were intended to gauge the viability of collecting occupational data elements and to test modes of collection and procedures, in FY 2015 BLS integrated the prior work into a large-scale nationally representative pre-production test. More information on the pre-production test is available on the BLS website ^[9].

3. ORS Review Program

3.1 An Overview

The ORS Review Program ensures the accuracy, consistency, and integrity of the ORS microdata. It is a comprehensive program that serves several purposes including: problem identification and resolution, data correction and documentation, economist certification, data integrity verification, and development of future data expectations, review edits, and guidelines. Through the ORS Review Program, problems are identified, communicated in a variety of feedback loops to affected offices, and resolved once root causes are addressed. The resolution process includes problem identification, individual mentoring, group training, refinement of procedures, refinement of review edits, and systems development. It is this dedication to data accuracy and data quality that is the foundation on which accurate survey estimates are produced.

The ORS Review Program includes varying review processes and is conducted by regional, quality assurance, and national office staff economists.

3.1.1. Field Office Regional Processes

Self-Review: Review completed by the individual data collector utilizing prompts, or edits, generated by the data capture system to identify potential data issues and possible data corrections. All edits or queries flagged by the data capture system must be verified prior to data submission.
Mentor Review – Regional Observation Review (ROB): Experienced data collectors are paired with inexperienced data collectors as mentors and mentees. Review consists of observations of data collection interviews and review of all collected data as entered into the data capture system. Purposes of this review include skills development (i.e., conducting and collecting data through interviews) and providing a forum for analysis of data capture for accuracy and adherence to survey procedures as well as data collector certification.

3.1.2. Field Office Quality Assurance Program

Staff Development Analysis (SDA): All data elements are reviewed by quality assurance analysts using a question and answer review approach. Goals include data accuracy and corrections, continued staff development and support, and adherence to survey procedures.
Technical Re-Interview Program (TRP): Independent data review through respondent re-contact that includes a random selection of occupations and data elements to be reviewed. Review assesses the interaction between the data collector and the respondent as well as the accuracy of the data captured. This is the primary means for data integrity verification.

Secondary review, as described under the National Office Processes, is also completed as part of Mentor, Staff Development Analysis, and Technical Re-Interview Program review.

3.1.3. National Office Processes

Targeted Review: Combination review approach in which certainty data elements are reviewed for all occupations (i.e., quotes) in the collection unit (i.e., establishment) as well as randomly selected data elements for select quotes. The purpose of this review includes focused review of the more complicated and interrelated data elements and verification of microdata with estimation and publication impact.
Secondary Review: In addition to queries (or edits) that have been included in the data capture system, additional queries (i.e., secondary edits) are run against all data elements outside the primary data capture system. These secondary edits explore the more complex relationships between the various data elements. This review is also used to develop and analyze new edits before they are moved into the data capture system. An example of a secondary edit: “Coding for decision making exceeds what is expected for this SVP level”. Only data that fail a secondary edit are reviewed to determine whether further data collector clarification is needed.
Cross-Establishment Review: Review of data by selected criteria, such as worker characteristics, industry, or area, across all establishments to identify outliers and unexpected outcomes, or trends and patterns in the data. This review is relatively new to the program and will continue to evolve as the ORS program evolves.
Statistical Review: Review performed to determine whether further clarification(s) is needed from the data collector in order to calculate accurate sampling weights, as the weights have an impact on the estimation processes. This review focuses on a comparison of the establishment assigned for collection to the establishment actually collected, to ensure they are the same. When the two units differ, weight adjustments are implemented.

3.2 ORS Review Process

The ORS review program and processes evolved over multiple phases of testing, collection, review, analysis, and evaluation. Early phases of collection and testing focused on how to collect the data. Initial edits were developed based on expectations and experience collecting similar surveys, and outside sources such as the O*NET. Collected data were reviewed for reasonableness and adherence to collection procedures. During each phase, and for each test, once all of the individual establishment data were collected and reviewed, they were analyzed to evaluate relationships within elements, between elements, and amongst related elements. Edits were refined based on analysis of the data, changes to procedures, and continued evaluation of the edit outcomes. If edits were rarely being flagged, or if they were flagging but not resulting in data being changed, they were re-evaluated to determine whether to keep or drop the edit, or modify the parameters that triggered the edit.
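
The re-evaluation rule described above can be sketched as a simple tally per edit. This is a minimal illustration and not the BLS implementation: the record layout, the edit names, and both thresholds (`min_flag_rate`, `min_change_rate`) are hypothetical, since the paper does not state how "rarely" was quantified.

```python
from dataclasses import dataclass


@dataclass
class EditStats:
    """Tally for one edit over a collection phase (hypothetical layout)."""
    name: str
    times_run: int       # records the edit was evaluated against
    times_flagged: int   # records where the edit fired
    times_changed: int   # flagged records where data were then changed


def needs_reevaluation(stats: EditStats,
                       min_flag_rate: float = 0.01,
                       min_change_rate: float = 0.10) -> bool:
    """True if the edit rarely flags, or flags without leading to data changes.

    Both thresholds are illustrative assumptions, not ORS parameters.
    """
    if stats.times_run == 0 or stats.times_flagged == 0:
        return True  # never exercised or never fired
    flag_rate = stats.times_flagged / stats.times_run
    if flag_rate < min_flag_rate:
        return True
    change_rate = stats.times_changed / stats.times_flagged
    return change_rate < min_change_rate


edits = [
    EditStats("hypothetical_edit_a", times_run=1000, times_flagged=2, times_changed=1),
    EditStats("hypothetical_edit_b", times_run=1000, times_flagged=200, times_changed=5),
    EditStats("hypothetical_edit_c", times_run=1000, times_flagged=150, times_changed=90),
]
to_review = [e.name for e in edits if needs_reevaluation(e)]
print(to_review)
```

An edit returned by this check is only a candidate: as the text notes, the outcome may be to keep it, drop it, or modify the parameters that trigger it.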

The review process also identified elements that data collectors appeared to be having difficulty coding, or complex relationships and concepts that data collectors may have had difficulty understanding. The findings were communicated back to the procedures and training groups who worked with SSA to refine the procedures and provide additional training and guidance. This cycle of training, collection, review, analysis, and evaluation continued through three phases of testing, six specific feasibility tests, and one Pre-Production test.

3.2.1. Additional Review Tools and Approaches

In addition to the ORS processes and edits being developed and refined, review tools have been, and continue to be, developed to assist reviewers. These tools include data visualization and cross-establishment review of data.

Data visualization is displaying data in a manner that allows people to explore or communicate the data efficiently. Good data visuals should enable anyone to see the big picture, easily compare values, and find patterns among the values; each of these should lead to a better understanding of the subject. The data visualization essentially turns thousands of rows of data into an interactive picture, graph, and dashboard where the reviewer can easily see all of the data at once. It provides context about the job being reviewed by showing other comparable data points and also provides some insight into the relationships between the elements.
Cross-establishment review is the ability to look at data, or data elements, by specific criteria or a combination of criteria across all establishments. It helps the reviewer identify outliers, trends, or patterns in the data.
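
As a rough illustration of cross-establishment review, the sketch below groups one coded element by occupation across establishments and flags values far from the rest of the group. The paper does not say how outliers are identified; Tukey's interquartile-range fences are substituted here as a common default, and the records, occupation labels, and element are invented.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical records: (establishment id, occupation label, hours per day coded for one element)
records = [
    ("E01", "SOC_A", 6.0), ("E02", "SOC_A", 6.5), ("E03", "SOC_A", 7.0),
    ("E04", "SOC_A", 6.0), ("E05", "SOC_A", 7.5), ("E06", "SOC_A", 6.5),
    ("E07", "SOC_A", 7.0), ("E08", "SOC_A", 1.0),
    ("E09", "SOC_B", 1.0), ("E10", "SOC_B", 1.5), ("E11", "SOC_B", 1.0),
    ("E12", "SOC_B", 2.0),
]


def flag_outliers(rows, k=1.5, min_group_size=4):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] within each occupation group."""
    by_occupation = defaultdict(list)
    for est_id, occupation, hours in rows:
        by_occupation[occupation].append((est_id, hours))

    flagged = []
    for occupation, group in by_occupation.items():
        values = [hours for _, hours in group]
        if len(values) < min_group_size:
            continue  # too few quotes to judge the spread
        q1, _, q3 = quantiles(values, n=4)
        low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
        flagged.extend((est_id, occupation, hours)
                       for est_id, hours in group
                       if hours < low or hours > high)
    return flagged


print(flag_outliers(records))
```

The same coded value (1.0 hours) is flagged in one occupation group and unremarkable in the other, which is the kind of context a reviewer cannot see when looking at a single establishment.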

3.2.2. Integrating the Data Collection and Review Processes

An overview of the data collection and review process follows:

Field economists collect data from the respondents.
Data are coded in a data capture system. The data capture system has its own set of edits that a data collector must address before marking the establishment complete. Some of these are Level 1 edits, which means the data must be changed before the establishment can be marked complete. These include blank data fields or situations where coding is not valid, such as “Gross Manipulation hours coded is greater than work schedule hours per day”. Level 2 edits can be addressed with documentation, and the data do not necessarily need to be changed.
Field economists (FEs) conduct a self-review of the data and address all review edits generated by the data capture system. FEs provide supporting documentation if the data entered are correct but fail the edit, or they note the data that were changed if the edit identified an error. Documentation is used to explain unexpected data during various reviews and post-collection processing, and to improve the edits for future collection efforts.
When the preceding steps have been accomplished, the establishment is marked complete.
Establishments marked complete go through a Review Management Tool and are randomly assigned to one of the following specific types of review:
- Staff Development Analysis (SDA)
- Technical Re-Interview Program (TRP)
- Targeted
- Secondary
Secondary edits are run outside of the data capture system and are stored in an ORS review and communication application. Secondary edits may explore some of the more complex relationships between elements, or may be edits that are being tested or implemented for the first time. They are created as secondary edits so that we can evaluate them and more easily modify them if necessary. Once they are refined, these edits may be moved into the data capture system. The secondary edits are used across all review types.
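
The distinction between Level 1 and Level 2 edits in the overview above can be sketched as follows. This is a minimal illustration under stated assumptions, not the data capture system's logic: the field names and the `Quote` layout are hypothetical, the Level 1 checks mirror the two examples given in the text, and the Level 2 edit is invented purely to show how documentation can resolve an edit without a data change.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Quote:
    """One occupation (quote); field names are hypothetical."""
    work_schedule_hours: Optional[float]
    gross_manipulation_hours: Optional[float]
    documentation: dict = field(default_factory=dict)  # edit name -> explanation


def level1_failures(q: Quote) -> list:
    """Level 1: data must be changed before the establishment can be marked complete."""
    if q.work_schedule_hours is None or q.gross_manipulation_hours is None:
        return ["blank_field"]
    if q.gross_manipulation_hours > q.work_schedule_hours:
        return ["gross_manipulation_exceeds_schedule"]
    return []


def level2_failures(q: Quote) -> list:
    """Level 2 (invented example): unusual but possible coding that needs documentation."""
    if (q.work_schedule_hours is not None
            and q.gross_manipulation_hours == q.work_schedule_hours):
        return ["gross_manipulation_entire_day"]
    return []


def can_mark_complete(q: Quote) -> bool:
    """Complete only if no Level 1 edit fails and every Level 2 edit is documented."""
    if level1_failures(q):
        return False
    return all(q.documentation.get(name, "").strip() for name in level2_failures(q))


print(can_mark_complete(Quote(8.0, 9.0)))   # Level 1 failure
print(can_mark_complete(Quote(8.0, 8.0)))   # Level 2 failure, undocumented
print(can_mark_complete(Quote(8.0, 8.0, {"gross_manipulation_entire_day": "Handles parts all shift"})))
```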

As mentioned previously, mentor review is a regional review process where an experienced data collector observes the collection and data coding process of an inexperienced data collector, and reviews the data. These schedules receive full review, and therefore do not need to be assigned to additional review by the Review Management Tool. Statistical review is conducted by the mathematical statisticians working on ORS and occurs on all establishments in the survey in addition to any other assigned review; therefore, it also does not need to be assigned by the Review Management Tool. The chart below shows the typical review process and includes approximate percentages of the establishments that undergo each type of review. The establishments are assigned to a review type based on specific percentages determined prior to the collection start. Approximately fifteen percent (15%) of establishments are assigned to SDA, five percent (5%) to TRP, twenty percent (20%) to Targeted, and sixty percent (60%) to Secondary.
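
The percentage-based assignment described above can be sketched with a weighted random draw. The shares come from the text; everything else (the function name, the establishment identifiers, and the use of independent draws) is an assumption, since the paper does not describe how the Review Management Tool performs the assignment.

```python
import random
from collections import Counter

# Approximate shares stated in the text, in percent.
REVIEW_SHARES = {"SDA": 15, "TRP": 5, "Targeted": 20, "Secondary": 60}


def assign_review(establishment_ids, rng):
    """Assign each completed establishment to one review type by weighted draw."""
    types = list(REVIEW_SHARES)
    weights = list(REVIEW_SHARES.values())
    return {est: rng.choices(types, weights=weights, k=1)[0]
            for est in establishment_ids}


rng = random.Random(2015)  # fixed seed so the illustration is repeatable
assignments = assign_review([f"EST{i:05d}" for i in range(10_000)], rng)
counts = Counter(assignments.values())
for review_type in REVIEW_SHARES:
    print(review_type, round(100 * counts[review_type] / len(assignments), 1))
```

Independent draws only approximate the target shares in any one collection period. A tool that must hit the percentages exactly would instead shuffle the establishments and cut the list at fixed proportions.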

Once an establishment is marked complete, it will go through the secondary edits. Any data flagged by the edits must be addressed by the applicable reviewer. Flagged edits are automatically loaded into the ORS review and communication application, where the reviewer will determine if the documentation adequately explains and supports the coding. If the documentation supports the coding, the reviewer will validate the flagged edit. If the documentation does not support the coding, the reviewer will send a question back to the data collector.

The review and communications application serves multiple purposes.

It is a repository for all secondary edits flagged in review.
It allows a reviewer to enter ad hoc questions to the data collector if the review is focusing on additional reviews of the data, or any specific data improvement projects where existing edits may not have been flagged.
It is the primary means of communication between the reviewer and the data collector, and is a repository for all communication (questions and answers) surrounding review of the establishment.
It allows for quick access to the data capture system, if the reviewer or data collector needs to review additional data or documentation from the establishment.
It records the review status. Once review is complete, the reviewer marks the establishment complete.

The ORS Review Program includes multiple review processes, as indicated in the chart. Targeted, Secondary, Cross-establishment and Statistical reviews are done at the National office. Targeted review focuses on randomly selected data elements (an example is shown in appendix, table 2) and certain data elements that were challenging during previous testing phases and that impact estimates and publications. Secondary review focuses on more complex edits that deal with inter-relationship of data elements. Cross-establishment review focuses on outliers across establishments and elements. Statistical review analyzes data elements that can affect sample weights.

This paper focuses on two of the review types: Targeted review and Secondary review. The goal of Targeted review and Secondary review, often known more generically as microdata review, and conducted at the individual establishment level, is to ensure that the data in the establishment are correctly coded and that documentation is sufficient. Additionally, it is important that as microdata are reviewed, these data support ORS estimation and validation processes, and ultimately the publication of accurate, consistent, high quality data. A total of 4,515 review questions were asked by reviewers during the pre-production test. Of the 4,515 questions asked, 2,740 resulted in data being changed. A total of 7,667 secondary edits were validated by reviewers without further action required of the field economists.
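
For a sense of scale, the share of review questions that led to a data change can be worked out directly from the counts reported above. The variable names are illustrative; the two counts are the only inputs taken from the paper.

```python
questions_asked = 4515        # review questions asked during the pre-production test
questions_with_change = 2740  # questions that resulted in data being changed

share_changed = questions_with_change / questions_asked
questions_without_change = questions_asked - questions_with_change

print(f"{share_changed:.1%} of questions led to a data change")
print(f"{questions_without_change} questions did not")
```

Roughly three in five questions sent back to a data collector ended in a correction.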

3.3 Targeted Review

Targeted review is a combination review approach in which select data elements (called “certainty” elements) are reviewed for all occupations collected from the establishment. In addition, other randomly selected data elements are reviewed for select occupations. The purpose of this review includes focused review of the more complicated and interrelated data elements and verification of microdata with estimation and publication impact.

Once an establishment has been selected for Targeted review, the ORS review and communication application randomly chooses occupations and data elements to be reviewed. Additionally, the reviewer checks the following certainty elements for all quotes:

Standard Occupational Classification (SOC) ^[7]: reviewers verify that the job description for all occupations (quotes) matches the definition in the SOC manual
Specific Vocational Preparation (SVP): reviewers check SVP coding and verify that the documentation submitted supports the coding
Sit/Stand hours and coding: reviewers check the reported sitting and standing hours against the documentation submitted. They verify that the job involves enough sitting or standing to perform other coded activities that require sitting or standing.

There are two element groups in Targeted review: Group A and Group B. The review and communications application also randomly selects one of the four sets of Group A elements and one of the five sets of Group B elements to be reviewed. Here are the steps in the selection process for a sampled establishment:

Randomly selects 2 occupations from the establishment to review
Randomly selects 1 of the 4 sets of elements in Group A to review
Randomly selects 1 of the 5 sets of elements in Group B to review

Below are the sets of data elements from each group that may be selected.

Group A Elements:

Use of upper extremities: Reaching Overhead, Reaching at or below shoulder, Gross Manipulation, Fine Manipulation
Use of lower extremities: Climbing stairs, Climbing ladders, Foot/leg controls, Crawling, Crouching, Stooping, Kneeling
Strength: Lift/carry max, Lift/carry, Push/pull - Hand/arm, Push/pull - Feet/leg, Keyboarding
Knowledge/complexity: Cognitive (adaptability, pace, decision making, supervision)

Group B Elements:

Vision/driving: Driving, Near, Far, Peripheral
Other senses: Communicating verbally, Noise intensity, Hearing (One on one, Group, Telephone, Other sounds, Pass hearing test), Cognitive (Work related personal interactions)
Climate: Heat, Cold, Outdoors, Wetness
Danger: High exposed places, Proximity to moving mechanical parts, Hazardous Contaminants
Miscellaneous: Heavy vibrations, Leveling (Knowledge), Leveling (JCC)
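
The selection steps and element sets listed above can be sketched as a small random draw. The set names and counts follow the text (only the set names are needed for the draw, so the individual elements are omitted); the function name, the returned layout, and the handling of establishments with fewer than two occupations are assumptions.

```python
import random

# Checked for every quote in the establishment.
CERTAINTY_ELEMENTS = ["SOC", "SVP", "Sit/Stand hours and coding"]

GROUP_A_SETS = [
    "Use of upper extremities",
    "Use of lower extremities",
    "Strength",
    "Knowledge/complexity",
]
GROUP_B_SETS = [
    "Vision/driving",
    "Other senses",
    "Climate",
    "Danger",
    "Miscellaneous",
]


def select_targeted_review(occupations, rng, n_occupations=2):
    """Draw the occupations and element sets for one establishment's Targeted review."""
    k = min(n_occupations, len(occupations))  # assumption: take all if fewer than two
    return {
        "certainty_elements": list(CERTAINTY_ELEMENTS),
        "occupations": rng.sample(occupations, k),
        "group_a_set": rng.choice(GROUP_A_SETS),
        "group_b_set": rng.choice(GROUP_B_SETS),
    }


rng = random.Random(7)  # fixed seed so the illustration is repeatable
plan = select_targeted_review(["Quote 1", "Quote 2", "Quote 3", "Quote 4"], rng)
print(plan)
```

With 4 Group A sets and 5 Group B sets there are 20 possible set pairings, so over many establishments every element is reviewed at some rate even though each establishment receives only one pairing.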

The targeted review edits are presented to the reviewer in the communication application and the randomly selected data elements from Group A and B are displayed. The reviewer checks the data elements in the two groups that have been randomly selected. The reviewer verifies that coding and documentation match, checks if any activities are likely to require elements to be present concurrently and verifies that the amount of time coded accounts for this, and checks if any activities require elements that are mutually exclusive and verifies that the amount of time coded accounts for this. Reviewers send any needed clarifications to the data collector via the communication application.

In addition to completing the targeted review as described above, the analyst also reviews all secondary edits that have flagged and verifies that data are explained. Secondary review, which is described next, is conducted concurrently with the targeted review of the establishment.

3.4 Secondary Review

The purpose of secondary review is to ensure that the data in the establishment are correctly coded and that documentation adequately explains and supports the coding. These secondary edits explore the more complex relationships between the various data elements. It is this review that is also used to develop and analyze new edits before they are moved into the data capture system.

In secondary review, only flagged secondary edits (an example is shown in appendix, table 3) are reviewed to determine whether clarification is needed from a data collector. The reviewer can validate the flagged edits or ask questions of the data collector to resolve any unusual coding. Unusual data may already be addressed by documentation in the completed establishment, including documentation in response to data capture system edits. If the documentation is sufficient, the reviewer will validate the query in the communication application. The reviewer will also ask any other questions that are not sufficiently supported by the occupational description and general information. All the communication between FEs and Reviewers is done in the communication application.
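
The flow described above, in which only flagged edits are reviewed and each is either validated or returned as a question, can be sketched as follows. The edit message is the example quoted earlier in this paper. Everything else is an assumption for illustration: the 1-9 SVP scale is taken from the DOT convention, the decision-making scale and the expected-maximum table are invented, and the `documentation_ok` flag stands in for the reviewer's judgment of whether documentation is sufficient.

```python
SECONDARY_EDIT_MESSAGE = (
    "Coding for decision making exceeds what is expected for this SVP level"
)

# Invented table: highest decision-making level expected at each SVP level.
EXPECTED_MAX_DECISION_LEVEL = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4, 9: 5}


def decision_making_edit(quote):
    """Return the edit message if the quote fails the edit, else None."""
    if quote["decision_making"] > EXPECTED_MAX_DECISION_LEVEL[quote["svp"]]:
        return SECONDARY_EDIT_MESSAGE
    return None


def triage(quotes):
    """Split flagged quotes into validated edits and questions for the data collector."""
    validated, questions = [], []
    for quote in quotes:
        if decision_making_edit(quote) is None:
            continue  # data that pass the edit are not reviewed here
        if quote["documentation_ok"]:
            validated.append(quote["quote_id"])
        else:
            questions.append(quote["quote_id"])
    return validated, questions


quotes = [
    {"quote_id": "Q1", "svp": 2, "decision_making": 1, "documentation_ok": False},
    {"quote_id": "Q2", "svp": 2, "decision_making": 3, "documentation_ok": True},
    {"quote_id": "Q3", "svp": 4, "decision_making": 4, "documentation_ok": False},
]
validated, questions = triage(quotes)
print("validated:", validated)
print("questions:", questions)
```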

The reviewer investigates and resolves all secondary edits that have flagged to accomplish the following:

Check assigned establishments for completeness
Verify that the establishment and occupations (quotes) have both useful and relevant data inputs
Check accuracy of coding
Verify that data entered in the system are validated and documented
Check whether any activities coded are likely to require that elements be present concurrently and verify that the amount of time coded accounts for this.
Evaluate whether any activities coded require elements that are mutually exclusive and verify that the amount of time coded accounts for this.
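
The last two checks in the list above are time-accounting checks, and a minimal sketch may make them concrete. Which elements are mutually exclusive or concurrent is not specified in this paper; the sketch assumes, for illustration only, that sitting and standing cannot overlap and that keyboarding requires fine manipulation for the same duration. All names and hours are hypothetical.

```python
def exceeds_schedule(hours, exclusive_elements, schedule_hours, tol=1e-9):
    """True if mutually exclusive elements are coded for more time than the work day."""
    total = sum(hours.get(name, 0.0) for name in exclusive_elements)
    return total > schedule_hours + tol


def short_of_activity(hours, required_elements, activity_hours, tol=1e-9):
    """Return elements coded for less time than an activity that needs them all at once."""
    return [name for name in required_elements
            if hours.get(name, 0.0) + tol < activity_hours]


coded_hours = {
    "sitting": 6.0,
    "standing": 3.0,
    "keyboarding": 5.0,
    "fine_manipulation": 2.0,
}

# Sitting plus standing is 9 hours against an 8-hour schedule.
print(exceeds_schedule(coded_hours, ["sitting", "standing"], schedule_hours=8.0))

# A 5-hour keyboarding activity implies at least 5 hours of each required element.
print(short_of_activity(coded_hours, ["keyboarding", "fine_manipulation"], activity_hours=5.0))
```

Either result would prompt the reviewer to look for documentation explaining the coding, or to send a question to the data collector.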

The data review and communications application is the primary system used in secondary review, and reviewers use the data capture system to check for documentation.

4. Summary

The ORS review program is a comprehensive review program. There are many review processes under one overall umbrella of review. Each review process has its own intended purposes and goals. The regions perform Mentor review as part of their training process, which includes observing collection and write-up of the collected data, by experienced staff. The Quality Technical Re-Interview Program (QTRP) team conducts the Technical Re-Interview Program (TRP), which is an independent data review through respondent re-contact, and is a primary means for data integrity verification. The regions also perform Staff Development Analysis (SDA), which is a full data review with the goals of data accuracy and staff development. National Office review includes Targeted review, Secondary review, Statistical review, and Cross-Establishment review. Targeted review focuses on the more complicated and interrelated data elements and verification of microdata with estimation and publication impact. Secondary review looks at secondary edits that explore the more complex relationships between the various data elements. Statistical review determines if the sample related data are correct in order to calculate accurate sampling weights. Cross-Establishment review is a review of data by selected criteria, across all establishments, to identify outliers and unexpected outcomes, or trends and patterns in the data.

Many tools aid in the various review processes; one of the most significant tools is the edits. Edits have been developed since the very early phases of the Occupational Requirements Survey. They are continuously refined; new edits are created based on review findings and analysis of the data; ineffective edits are modified or deleted.

Additional review tools have been developed for ORS. Data visualization and cross-establishment review are examples of tools that are improving the review processes. These tools are in their beginning stages and look across workers by SOC, NAICS, worker characteristics, etc., in order to provide better data for the ORS estimates. They have been tested and used in the Pre-Production test by staff performing Targeted and Secondary review. They will continue to be developed and improved, with the goal of expanding their use, not only in terms of the amount of data covered but also for use by the other review processes.

^[1] Social Security Administration, Occupational Information System Project, http://www.ssa.gov/disabilityresearch/occupational_info_systems.html
^[2] U.S. Department of Labor, Employment and Training Administration (1991), “Dictionary of Occupational Titles, Fourth Edition, Revised 1991”
^[3] U.S. Department of Labor, O*Net Online, http://www.onetonline.org/
^[4] U.S. Bureau of Labor Statistics (2008) BLS Handbook of Methods, Occupational Employment Statistics, Chapter 3. https://www.bls.gov/opub/hom/pdf/homch3.pdf
^[5] National Compensation Survey, NCS, https://www.bls.gov/ncs/
^[6] See North American Industry Classification System https://www.census.gov/eos/www/naics/
^[7] See Standard Occupational Classification website, https://www.bls.gov/soc/
^[8] U.S. Bureau of Labor Statistics, National Compensation Survey: Guide for Evaluating Your Firms’ Jobs and Pay, May 2013 (Revised), https://www.bls.gov/ncs/ocs/sp/ncbr0004.pdf
^[9] https://www.bls.gov/ors/#pretesting

Any opinions expressed in this paper are those of the author and do not constitute policy of the Bureau of Labor Statistics or the Social Security Administration.

Appendix - ORS Data Elements

Table 1 shows the list of ORS data elements. The communication application randomly chooses occupations (quotes) and data elements for targeted review. Table 2 shows an example of the 2 occupations (quotes) randomly selected as well as 1 of the 4 groups of data elements in Group A, and 1 of the 5 groups of data elements in Group B also randomly selected. Table 3 shows examples of secondary review edits.

Table 1: Data Elements

Table 2: Example of Targeted Review Edits

Table 3: Example of Secondary Review Edits

Last Modified Date: December 10, 2015