JOLTS Metropolitan Statistical Area (MSA) Research Estimates Methodology

The JOLTS sample of 16,000 establishments does not directly support the production of sample-based state or sub-state estimates. However, Metropolitan Statistical Area (MSA) estimates have been produced for the 18 largest MSAs, as identified by Current Employment Statistics (CES) Metro Area employment. The estimates combine the available sample with model-based estimates and are smoothed by taking a 3-month moving average. These data are experimental. As such, they have not been subject to the same level of review as the current official JOLTS national and regional estimates. BLS is inviting data users to comment on both the methodology used to produce these estimates and on the usefulness of these data.

These estimates consist of four major estimating models: the Composite Regional model (an unpublished intermediate model), the Synthetic model (an unpublished intermediate model), the Composite Synthetic model (published historical series through the most current benchmark year), and the Extended Composite Synthetic model (published current-year monthly series). The Composite Regional model uses JOLTS microdata, JOLTS regional published estimates, and Current Employment Statistics (CES) employment data. The Composite Synthetic model uses JOLTS microdata and Synthetic model estimates, which are derived from monthly employment changes in microdata from the Quarterly Census of Employment and Wages (QCEW) and from JOLTS published regional data. The Extended Composite Synthetic model extends the Composite Synthetic estimates by multiplying them by the ratio of the current Composite Regional model estimate to the Composite Regional model estimate from one year ago.

The Extended Composite Synthetic model (and its major component, the Composite Regional model) is used to extend the Composite Synthetic estimates because all of the inputs required by this model are available at the time monthly estimates are produced. In contrast, the Composite Synthetic model (and its major component, the Synthetic model) can only be produced when the latest QCEW data are available. The plan is to use Extended Composite Synthetic model estimates to extend the Composite Synthetic model estimates during the annual JOLTS re-tabulation process. Extending the Composite Synthetic model with Composite Regional model estimates, which are based on current data, will ensure that the Composite Synthetic model estimates reflect current economic trends.

The following outlines each model in a non-technical summary format. Each model is summarized separately, and answers the following:

What is the approach attempting to do?
What data inputs are used in the approach?
How does the approach attempt to use that data?
What data outputs are produced by the approach?
What limitations does the approach have?
What more needs to be done?

Composite Regional Model

What approach?

The Composite Regional approach calculates MSA-level JOLTS estimates from JOLTS microdata using sample weights and the adjustments for non-response (NRAF). The Composite Regional estimate is then benchmarked to CES MSA-supersector employment to produce MSA-supersector estimates. The JOLTS sample, by itself, cannot ensure a reasonably sized sample for each MSA-supersector cell, so many MSA-supersector cells lack enough data to produce a reasonable estimate. To overcome this issue, the MSA-level estimates derived directly from the JOLTS sample are augmented using JOLTS regional estimates when the number of respondents is low (that is, fewer than 30). This approach is known as a composite estimate: it uses the small JOLTS sample to the greatest extent possible and supplements it with a model-based estimate. Previous research has found that regional industry estimates are a good proxy at finer levels of geographical detail. That is, one can make a good prediction of JOLTS estimates at the regional level using only national industry-level JOLTS rates. The assumption in this approach is that one can likewise make a good prediction of JOLTS estimates at the MSA level using only regional industry-level JOLTS rates.

In this approach, the JOLTS microdata-based estimate is used, without model augmentation, in all MSA-supersector cells that have 30 or more respondents. The JOLTS regional estimate is used, without a sample-based component, in all MSA-supersector cells that have fewer than five respondents. In all MSA-supersector cells with 5–30 respondents, the estimate is a weighted combination of the microdata-based estimate and the JOLTS regional estimate. The weight assigned to the JOLTS data in those cells is proportional to the number of JOLTS respondents in the cell (weight=n∕30, where n is the number of respondents). At n=30 the weight equals 1, so a cell with exactly 30 respondents uses the microdata-based estimate alone.

What data inputs?

All JOLTS microdata records
All weights from JOLTS estimation (final weights that account for sampling weight, NRAF, agg-codes, etc.)
JOLTS published regional rates estimates (regional JO, H, Q, LD, and TS rates)
CES MSA-supersector employment

How are MSAs defined?

MSAs are defined using the most recent available MSA definitions established by the Office of Management and Budget, which are based on Census Bureau population data. Each MSA is delineated using FIPS county and township codes. The current MSA delineation is then extended back through time, keeping the MSA definition fixed for the entire JOLTS historical series.

How are data used?

All JOLTS microdata are weighted using final weights. A weighted estimate is made for each JOLTS respondent.
Counts are made for each MSA-supersector cell.
Each JOLTS respondent is paired with its regional rate estimate for all variables.
Based on the count of respondents in the MSA-supersector cell the JOLTS respondent belongs to, a Composite Model Weight (CMW) is calculated:
1. If the count is > 30, then the CMW for the respondent data = 1. The CMW for the regional estimate = 0.
2. If the count is < 5, then the CMW for the respondent data = 0. The CMW for the regional estimate = 1.
3. If the count is 5–30, then the CMW for the respondent data = n∕30, where n is the number of respondents. The CMW for the regional estimate = 1−n∕30.
The MSA-level rate estimate is therefore the final weighted respondent-based JOLTS rate times the respondent CMW, plus the regional rate times the regional CMW. The result is benchmarked to the CES MSA-level employment estimate:
1. FINAL ESTIMATE = CES MSA EMP × ((final weight JOLTS rate × respondent CMW) + (regional rate × regional CMW)).
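
The weighting rule and final estimate above can be illustrated with a minimal Python sketch. The function names, argument names, and example values are hypothetical and are not part of JOLTS production code. Rates are assumed to be expressed as fractions of employment, so that multiplying by CES employment yields a level.

```python
def composite_model_weight(n: int, full: int = 30, minimum: int = 5) -> float:
    """Weight placed on the respondent-based rate for a cell with n respondents.

    Thresholds follow the text: fewer than 5 respondents -> 0,
    5 to 30 respondents -> n/30, more than 30 -> 1.
    """
    if n < minimum:
        return 0.0
    if n >= full:
        return 1.0  # n/30 equals 1 at n = 30, so the boundary case is unambiguous
    return n / full


def composite_estimate(ces_emp: float, sample_rate: float,
                       regional_rate: float, n: int) -> float:
    """Blend the two rates, then benchmark to CES employment for the cell.

    Assumption: both rates are fractions of employment (e.g. 0.04 for 4 percent).
    """
    w = composite_model_weight(n)
    blended_rate = sample_rate * w + regional_rate * (1.0 - w)
    return ces_emp * blended_rate


# Hypothetical cell: 15 respondents, so the two rates are weighted equally.
example = composite_estimate(ces_emp=100_000, sample_rate=0.04,
                             regional_rate=0.03, n=15)
```

With these illustrative inputs, the blended rate is 0.035 and the benchmarked estimate is roughly 3,500.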

How are outputs produced?

This model produces MSA-level estimates of JO, H, Q, LD, and TS. The estimates cover the most current month and can be produced during monthly JOLTS estimation production.

What are the limitations?

JOLTS data are somewhat volatile at the national and regional levels because of the small sample size, which in turn results in volatile MSA estimates.

What more is needed?

These estimates are based upon a model. There is, as of yet, no methodology in place that can produce an estimate of error for the estimates the model produces. Research on a methodology to produce an error estimate is currently underway.

The Composite Regional supersector estimates are summed across MSA industry supersectors to the nonfarm level.

Synthetic Model

What approach?

The Synthetic model differs fundamentally from the Composite Regional model. The Synthetic approach does not use JOLTS microdata. Instead, it uses QCEW data that have been linked longitudinally in the Longitudinal Database (LDB), referred to here as the QCEW-LDB. The Synthetic model attempts to convert QCEW-LDB monthly employment change microdata into JOLTS job openings, hires, quits, layoffs and discharges, and total separations data.

What data inputs?

All monthly employment changes for each record on the QCEW-LDB
JOLTS published regional estimates (regional JO, H, Q, LD, and TS)

How are data used?

Every record on the QCEW-LDB is classified as expanding, contracting, or stable based on monthly employment change.
1. For expanding records, the amount of employment growth is converted to JOLTS hires. They are given no separations.
2. For contracting records, the amount of employment decline is converted to JOLTS separations. They are given no hires.
3. For stable records, no attribution of JOLTS hires or separations is made.
The entire QCEW-LDB is summarized to the US Census regional level.
The QCEW-LDB regional summary is ratio adjusted to the JOLTS published regional estimate for hires and total separations.
1. For each region, the ratio of JOLTS published hires and total separations to QCEW-LDB-based regional hires and total separations is calculated (Ratio-H for hires and Ratio-TS for total separations), so that multiplying by the ratio reproduces the published regional total.
2. Each record on the QCEW-LDB has its converted JOLTS data multiplied by the Ratio-H and Ratio-TS for its US Census region.
3. After the adjustment:
  1. For expanding records, the amount of employment growth is then: (JOLTS hires×Ratio-H). They remain with no separations.
  2. For contracting records, the amount of employment decline is then: (JOLTS separations×Ratio-TS). They remain with no hires.
  3. For stable records, they remain with no JOLTS hires or separations.
To produce MSA-level estimates, sum the record-level hires×Ratio-H by MSA to produce an MSA-level JOLTS hires estimate, and sum the record-level TS×Ratio-TS by MSA to produce an MSA-level JOLTS total separations estimate.
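
The steps above can be sketched in Python as follows. This is an illustration only: the record layout, region and MSA labels, and published totals are hypothetical. The sketch assumes each ratio is defined as the published regional level divided by the raw QCEW-LDB regional sum, so that the adjusted records add up to the published regional estimate.

```python
from collections import defaultdict

# Hypothetical QCEW-LDB records: (region, MSA, monthly employment change).
records = [
    ("South", "MSA_A", 10),   # expanding
    ("South", "MSA_A", -4),   # contracting
    ("South", "MSA_B", 6),    # expanding
    ("South", "MSA_B", 0),    # stable
    ("South", "MSA_B", -8),   # contracting
]
# Hypothetical JOLTS published regional levels.
published = {"South": {"hires": 32.0, "seps": 18.0}}


def synthetic_msa_estimates(records, published):
    """Convert employment changes to hires/separations, ratio-adjust, sum by MSA."""
    converted = []
    regional_raw = defaultdict(lambda: {"hires": 0.0, "seps": 0.0})
    for region, msa, change in records:
        hires = float(max(change, 0))    # growth becomes hires, no separations
        seps = float(max(-change, 0))    # decline becomes separations, no hires
        converted.append((region, msa, hires, seps))
        regional_raw[region]["hires"] += hires
        regional_raw[region]["seps"] += seps

    # Ratio-H and Ratio-TS per region (assumed: published / raw regional sum).
    ratios = {
        region: {k: (published[region][k] / raw[k]) if raw[k] else 0.0
                 for k in ("hires", "seps")}
        for region, raw in regional_raw.items()
    }

    # Apply the regional ratios to every record and sum by MSA.
    by_msa = defaultdict(lambda: {"hires": 0.0, "seps": 0.0})
    for region, msa, hires, seps in converted:
        by_msa[msa]["hires"] += hires * ratios[region]["hires"]
        by_msa[msa]["seps"] += seps * ratios[region]["seps"]
    return dict(by_msa)


estimates = synthetic_msa_estimates(records, published)
```

In this example the raw regional sums are 16 hires and 12 separations, giving a Ratio-H of 2.0 and a Ratio-TS of 1.5. The adjusted MSA estimates then sum to the published regional totals of 32 and 18.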

How are the outputs produced?

Synthetic job openings are a function of the ratio of industry-regional job openings and hires. This ratio of published job openings to hires is applied to model hires estimates to derive model job opening estimates. Ratio-adjusting the JOLTS model hires and separations to the regional published JOLTS hires and separations estimates ensures that the JOLTS published churn rate is fully accounted for.
Synthetic quits and layoffs and discharges are a function of the relative percentage of the individual components of total separations at the industry-regional level. The relative percentages of each component are applied to the model separations estimates to derive model quits and layoffs and discharges.
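
As an illustration of the two derivations above, the following Python sketch applies published ratios to model hires and separations. The function name and all input values are hypothetical. The published figures are assumed to come from the same industry-region cell as the model estimates.

```python
def derive_components(model_hires: float, model_seps: float,
                      pub_jo: float, pub_hires: float,
                      pub_quits: float, pub_ld: float, pub_ts: float) -> dict:
    """Derive model job openings, quits, and layoffs and discharges.

    Job openings: model hires times the published ratio of job openings to hires.
    Quits and layoffs/discharges: model separations times each component's
    published share of total separations. The two shares need not sum to 1,
    because total separations also include other separations.
    """
    return {
        "job_openings": model_hires * (pub_jo / pub_hires),
        "quits": model_seps * (pub_quits / pub_ts),
        "layoffs_discharges": model_seps * (pub_ld / pub_ts),
    }


# Hypothetical inputs: model hires of 20 and model separations of 6.
components = derive_components(model_hires=20.0, model_seps=6.0,
                               pub_jo=48.0, pub_hires=32.0,
                               pub_quits=9.0, pub_ld=6.0, pub_ts=18.0)
```

Here the published job-openings-to-hires ratio is 1.5, so model job openings come to 30. Quits take half of model separations, and layoffs and discharges take one third.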

What are the limitations?

This approach is not meant to model individual QCEW-LDB data records. It would not be prudent to use this approach to model small populations (30 or fewer establishments). The model works best at the MSA level. While it is possible to model smaller populations, the strength of the model may decline in proportion to the reduction in the size of the population being modeled.
The model does generate MSA-level job openings and separations breakouts. However, these estimates are based upon ratios that are common across the region to which an MSA belongs. If there are significant differences in the ratio of job openings to hires or in the separations breakouts for any particular MSA (or set of MSAs) within a region, the model cannot detect them and the estimates will not reflect those differences.
Since the model is based on QCEW-LDB data, it cannot produce current MSA-level estimates, because QCEW-LDB data lag current JOLTS estimation production by 6–9 months.

What more is needed?

These estimates are based upon a model. There is, as of yet, no methodology in place that can produce any estimate of error for the estimates the model produces. Research on a methodology to produce an error estimate is currently underway.

Composite Synthetic Model

What approach?

The Composite Synthetic model is nearly identical to the Composite Regional model. The primary difference is the use of the Synthetic model estimates (described in the preceding section) rather than JOLTS published regional estimates when there is an insufficient amount of JOLTS microdata to produce an MSA-supersector estimate.

Just like the Composite Regional approach, the JOLTS microdata-based estimate is used in all MSA-supersector cells that have 30 or more respondents. However, in contrast to the Composite Regional approach, the Composite Synthetic approach uses the Synthetic estimate in all MSA-supersector cells that have fewer than five respondents. In all MSA-supersector cells with 5–30 respondents, the estimate is a weighted combination of the microdata-based estimate and the Synthetic estimate. The weight assigned to the JOLTS data in those cells is proportional to the number of JOLTS respondents in the cell (weight=n∕30, where n is the number of respondents).

The Composite Synthetic supersector estimates are summed across MSA-supersectors to the nonfarm level. Composite Synthetic estimates are averaged across 3 months, creating a 3-month moving average.
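
The smoothing step can be sketched in Python as below. The source does not state whether the 3-month window is trailing or centered. This sketch assumes a trailing window (the current month and the two preceding months), and the function name and input values are hypothetical.

```python
def three_month_moving_average(series):
    """Trailing 3-month mean of a monthly series.

    Assumption: the window covers the current month and the two months
    before it, so the first two months of the series have no smoothed value.
    """
    return [sum(series[i - 2:i + 1]) / 3 for i in range(2, len(series))]


# Hypothetical monthly nonfarm estimates for one MSA.
monthly = [30.0, 36.0, 33.0, 39.0]
smoothed = three_month_moving_average(monthly)  # two smoothed values
```

For these four months, the smoothed values are 33.0 (months 1 to 3) and 36.0 (months 2 to 4).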

What data inputs?

All JOLTS microdata records
All weights from JOLTS estimation (final weights that account for sampling weight, NRAF, agg-codes, etc.)
Synthetic estimates (regional JO, H, Q, LD, and TS rates)
JOLTS regional-level estimates (to benchmark the MSA estimates)
CES MSA-supersector employment

How are data used?

All JOLTS microdata are weighted using final weights. A weighted estimate is made for each JOLTS respondent.
Counts are made for each MSA-supersector cell.
Each JOLTS respondent is paired with its Synthetic rate estimate for all variables.
Based on the count of respondents in the MSA-supersector cell the JOLTS respondent belongs to, a Composite Model Weight (CMW) is calculated:
1. If the count is > 30, then the CMW for the respondent data = 1. The CMW for the Synthetic estimate = 0.
2. If the count is < 5, then the CMW for the respondent data = 0. The CMW for the Synthetic estimate = 1.
3. If the count is 5–30, then the CMW for the respondent data = n∕30, where n is the number of respondents. The CMW for the Synthetic estimate = 1−n∕30.
The MSA-level rate estimate is therefore the final weighted respondent-based JOLTS rate times the respondent CMW, plus the Synthetic rate times the Synthetic CMW. The result is benchmarked to the CES MSA-level employment estimate:
1. FINAL ESTIMATE = CES MSA EMP × ((final weight JOLTS rate × respondent CMW) + (synthetic rate × Synthetic CMW)).

How are outputs produced, and what are the limitations?

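This model produces MSA-level estimates of JO, H, Q, LD, and TS. These estimates cannot be produced without lag, because the Synthetic component depends on QCEW-LDB data that lag JOLTS estimation production by 6–9 months.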

What more is needed?

Extended Composite Synthetic Model

What approach?

The Extended Composite Synthetic model is designed to project the Composite Synthetic forward until QCEW-LDB data are available to produce Composite Synthetic estimates. The Composite Synthetic estimates are extended using the ratio of the current Composite Regional MSA industry estimate to the Composite Regional MSA industry estimate from one year ago.

This approach ensures that the Extended Composite Synthetic MSA estimates reflect current JOLTS regional and industry-level economic conditions. The Extended Composite Synthetic estimates reflect current JOLTS MSA economic conditions to the extent that sufficient JOLTS microdata are available.

What data inputs?

The historical series of Composite Synthetic model estimates at the MSA-industry-level
The historical series of Composite Regional model estimates at the MSA-industry-level

How are data used?

The Composite Synthetic model estimates are produced at a lag since QCEW-LDB data are only available at a 6–9 month lag relative to JOLTS data. The Composite Regional model estimates, in contrast, are not produced at a lag and are available concurrent with JOLTS data. Therefore, Composite Synthetic estimates can be extended by ratio-adjusting the Composite Synthetic estimates by the ratio of current Composite Regional estimates to the Composite Regional estimates from one year ago at the MSA-industry-level as follows:

Extended Composite Synthetic Model formula:

$$ECS_{t} = CS_{t-12} \times \frac{CR_{t}}{CR_{t-12}}$$

Where

$ECS_{t}$ is the Extended Composite Synthetic MSA industry estimate for month t
$CS_{t-12}$ is the Composite Synthetic MSA industry estimate for month t-12 (one year ago)
$CR_{t}$ is the Composite Regional MSA industry estimate for month t
$CR_{t-12}$ is the Composite Regional MSA industry estimate for month t-12 (one year ago)

MSA-level estimates are produced by summing the Extended Composite Synthetic estimates over industry.
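
The extension and the sum over industry can be illustrated with a short Python sketch. The function name, industry labels, and values are hypothetical. Each input is assumed to be a mapping from industry to an estimate for a single MSA.

```python
def extended_composite_synthetic(cs_year_ago: dict, cr_current: dict,
                                 cr_year_ago: dict) -> dict:
    """Extend year-ago Composite Synthetic estimates, industry by industry.

    Each year-ago Composite Synthetic estimate is multiplied by the ratio of
    the current Composite Regional estimate to the year-ago Composite Regional
    estimate. A zero year-ago Composite Regional value raises ZeroDivisionError;
    the source does not say how such cells are handled.
    """
    return {
        industry: cs * cr_current[industry] / cr_year_ago[industry]
        for industry, cs in cs_year_ago.items()
    }


# Hypothetical estimates for one MSA and two industries.
extended = extended_composite_synthetic(
    cs_year_ago={"Manufacturing": 200.0, "Retail trade": 100.0},
    cr_current={"Manufacturing": 110.0, "Retail trade": 90.0},
    cr_year_ago={"Manufacturing": 100.0, "Retail trade": 100.0},
)
msa_total = sum(extended.values())  # MSA-level estimate: sum over industry
```

In this example the Composite Regional estimate rose 10 percent over the year in one industry and fell 10 percent in the other, so the extended estimates are 220 and 90, and the MSA-level total is 310.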

How are outputs produced, and what are the limitations?

This model will produce MSA-level estimates of JO, H, Q, LD, and TS. These estimates are produced without lag. The methodology allows the Extended Composite Synthetic data to reflect current economic trends at the CESID Industry-Region level. The projection reflects current MSA economic trends where sufficient JOLTS microdata are available.

Sample allocation

What is the sample size allocation for the inputs used to produce the JOLTS MSA estimates?

The JOLTS MSA Research Estimates Sample Allocation table below provides a snapshot of the sample used in the MSA research estimates. Sample units are used in both components of the model. The sample component incorporates JOLTS MSA respondent data. The model component incorporates JOLTS regional-level respondent data, CES Metro Area respondent data, and QCEW establishment counts.

SAMPLE ALLOCATION: For MSA Estimator Components
MSA Code	Area Title	JOLTS MSA respondents^[1], 2018	JOLTS Regional respondents^[2], 2018	JOLTS Regional respondents^[2], 2019	QCEW Establishments^[3], 2018	CES Metro Area respondents^[4], 2018	CES Metro Area respondents^[4], 2019
12060	Atlanta-Sandy Springs-Roswell, GA	135	3,086	2,979	148,425	4,245	4,243
16980	Chicago-Naperville-Elgin, IL-IN-WI	254	2,153	2,096	255,494	9,490	9,205
19100	Dallas-Fort Worth-Arlington, TX	164	3,086	2,979	184,405	4,889	4,516
19740	Denver-Aurora-Lakewood, CO	96	2,034	1,933	106,323	2,945	2,735
19820	Detroit-Warren-Dearborn, MI	92	2,153	2,096	98,957	3,991	3,870
26420	Houston-The Woodlands-Sugar Land, TX	147	3,086	2,979	158,941	4,094	4,005
31080	Los Angeles-Long Beach-Anaheim, CA	306	2,034	1,933	641,292	10,917	10,658
33100	Miami-Fort Lauderdale-West Palm Beach, FL	152	3,086	2,979	223,400	4,251	4,076
33460	Minneapolis-St. Paul-Bloomington, MN-WI	130	2,153	2,096	98,850	2,781	2,768
35620	New York-Newark-Jersey City, NY-NJ-PA	522	1,893	1,862	614,481	15,366	14,910
37980	Philadelphia-Camden-Wilmington, PA-NJ-DE-MD	171	1,893	1,862	167,824	6,181	6,066
38060	Phoenix-Mesa-Scottsdale, AZ	107	2,034	1,933	106,318	3,332	3,318
40140	Riverside-San Bernardino-Ontario, CA	81	2,034	1,933	131,032	3,664	3,616
41740	San Diego-Carlsbad, CA	69	2,034	1,933	115,834	3,659	3,587
41860	San Francisco-Oakland-Hayward, CA	87	2,034	1,933	205,333	5,515	5,450
42660	Seattle-Tacoma-Bellevue, WA	97	2,034	1,933	132,862	3,946	3,864
47900	Washington-Arlington-Alexandria, DC-VA-MD-WV	156	3,086	2,979	194,899	3,050	2,964
71650	Boston-Cambridge-Nashua, MA-NH NECTA	174	1,893	1,862	175,160	5,966	5,851
00000	All Areas	2,940	9,166	8,870	3,759,830	98,282	95,702
Footnotes:
^[1] JOLTS Sample Units used in Sample Component of the Composite & Extended Composite Models
^[2] JOLTS Sample Units used in Model Component of the Composite & Extended Composite; the Total is the sum of the four regions
^[3] QCEW Establishments used in the Model Component of the Synthetic and Composite Synthetic Model
^[4] CES UI Sample Units used in Model Component of the Composite Synthetic & Extended Composite Models

What more is needed?

Last Modified Date: July 31, 2020