BLS Establishment Estimates Revised to Incorporate March 2007 Benchmarks
Daniel Jackson

Daniel Jackson is an economist in the Division of Current Employment Statistics, Office of Employment and Unemployment Statistics, Bureau of Labor Statistics. Telephone: (202) 691-6555; e-mail: CESInfo@bls.gov

Introduction

With the release of data for January 2008, the Bureau of Labor Statistics (BLS) introduced its annual revision of national estimates of employment, hours, and earnings from the Current Employment Statistics (CES) monthly survey of nonfarm establishments. Each year, the CES survey realigns its sample-based estimates to incorporate universe counts of employment, a process known as benchmarking. Comprehensive counts of employment, or benchmarks, are derived primarily from unemployment insurance (UI) tax reports that nearly all employers are required to file with State Workforce Agencies.

Summary of the benchmark revisions

The March 2007 benchmark level for total nonfarm employment is 136,533,000; this figure is 293,000 below the sample-based estimate for March 2007, an adjustment of -0.2 percent. Table 1 shows the total nonfarm percentage benchmark revisions for the past ten years. Table 2 shows the nonfarm employment benchmarks for March 2007, not seasonally adjusted, by industry.

As is usually the case, benchmark revisions at many industry levels were larger in percentage terms than at total nonfarm, but were offsetting. No individual supersector dominated in terms of the size of revision.

Six supersectors had downward revisions. The largest downward revision occurred in manufacturing, with a revision of -137,000, or -1.0 percent. The revision is concentrated in machinery (revised by -33,900, or -2.9 percent), plastic and rubber products (revised by -33,400, or -4.4 percent), and computer and electronic products (revised by -29,000, or -2.3 percent). Other supersectors had downward revisions of similar magnitude.
Estimates in financial activities were revised by -111,000, or -1.3 percent, while estimates in leisure and hospitality were revised by -108,000, or -0.8 percent. Within financial activities, insurance carriers was revised by -47,700, or -3.4 percent. Within leisure and hospitality, limited-service eating places was revised by -46,000, or -1.1 percent. Information had a revision of -54,000, or -1.8 percent. Most of the revision in information was driven by telecommunications, which was revised by -29,600, or -2.9 percent. Other supersectors with downward revisions were government (-52,000, or -0.2 percent) and education and health services (-39,000, or -0.2 percent).

Four supersectors had upward revisions. Trade, transportation, and utilities was revised upward by 140,000, or 0.5 percent. Within the supersector, retail trade dominated with a revision of 107,500, or 0.7 percent. Also contributing were an upward revision in wholesale trade of 21,500, or 0.4 percent, and an upward revision in transportation and warehousing of 11,500, or 0.3 percent. The other supersectors with upward revisions were professional and business services (revised up 44,000, or 0.2 percent), other services (18,000, or 0.3 percent), and construction (6,000, or 0.1 percent).

Revisions in the post-benchmark period

Post-benchmark period estimates from April 2007 to October 2007 were calculated for each month based on new benchmark levels, new model-based estimates for net birth/death employment, and a slightly different sample composition resulting from the annual sample update (beginning with November).

Text table A shows the net birth/death model figures for the supersectors over the post-benchmark period. From April 2007 to December 2007, the cumulative net birth/death model added 883,000, compared with 1,059,000 in the previously published April to December estimates.

Text table A. Net Birth/Death Estimates, Post-Benchmark 2007
Why benchmarks differ from estimates

A benchmark revision is the difference between the benchmark employment level for a given March and its corresponding sample-based estimate. The overall accuracy of the establishment survey is usually gauged by the size of this difference. The benchmark revision often is regarded as a proxy for total survey error, but this does not take into account error in the universe data. The employment counts obtained from quarterly unemployment insurance tax forms are administrative data that reflect employer record-keeping practices and differing State laws and procedures. The benchmark revision can be more precisely interpreted as the difference between two independently derived employment counts, each subject to its own error sources.
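As a small worked illustration of this definition, the sketch below recomputes the March 2007 total nonfarm revision from the figures quoted in the summary and checks that the rounded supersector revisions offset to the same total. The choice of the sample-based estimate as the denominator for the percentage is an assumption made here; the article does not state which base is used, and either base rounds to -0.2 percent.

```python
# All levels are in thousands of jobs, taken from the summary of revisions.
benchmark_level = 136_533   # March 2007 total nonfarm benchmark
revision = -293             # benchmark minus sample-based estimate

# Recover the sample-based estimate implied by the two published figures.
sample_estimate = benchmark_level - revision

# Assumption: the percentage is expressed relative to the sample-based estimate.
percent_revision = 100 * revision / sample_estimate

# Rounded supersector revisions as quoted in the summary.
supersector_revisions = {
    "manufacturing": -137,
    "financial activities": -111,
    "leisure and hospitality": -108,
    "information": -54,
    "government": -52,
    "education and health services": -39,
    "trade, transportation, and utilities": 140,
    "professional and business services": 44,
    "other services": 18,
    "construction": 6,
}
net_of_supersectors = sum(supersector_revisions.values())

print(f"sample-based estimate: {sample_estimate:,} thousand")
print(f"percent revision: {percent_revision:.1f}")
print(f"net of supersector revisions: {net_of_supersectors:,} thousand")
```

The supersector figures are rounded to the nearest thousand, so exact agreement with the total is not guaranteed in general.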
Like all sample surveys, the establishment survey is susceptible to two sources of error: sampling error and nonsampling error. Sampling error is present any time a sample is used to make inferences about a population. The magnitude of the sampling error, or variance, relates directly to sample size and the percentage of the universe covered by that sample. The CES monthly survey captures slightly under one-third of the universe, exceptionally high by usual sampling standards. This coverage ensures a small sampling error at the total nonfarm employment level.

Both the universe counts and the establishment survey estimates are subject to nonsampling errors common to all surveys: coverage, response, and processing errors. The error structures for both the CES monthly survey and the UI universe are complex. Still, the two programs generally produce consistent total employment figures, each validating the other. Over the last decade, annual benchmark revisions at the total nonfarm level have averaged 0.2 percent, with an absolute range from less than 0.05 percent to 0.6 percent.

Benchmark revision effects for other data types

The routine benchmarking process results in revisions to the series for production and nonsupervisory workers. There are no benchmark employment levels for these series; they are revised by preserving the ratios of employment for the particular data type to all-employee employment prior to benchmarking, and then applying these ratios to the revised all-employee figures. These figures are calculated at the basic cell level and then aggregated to produce the summary estimates.

Average weekly hours and average hourly earnings are not benchmarked; they are estimated solely from reports supplied by survey respondents at the basic estimating cell level. The aggregate industry levels of the hours and earnings series are derived as weighted averages.
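The ratio-preservation step for production and nonsupervisory workers can be sketched as below. This is a minimal illustration, not the CES production system: the function name, the cell names, and all employment levels are hypothetical.

```python
def revise_production_workers(pw_old, ae_old, ae_new):
    """Carry the pre-benchmark production-worker share over to the revised level.

    pw_old: production worker employment before benchmarking
    ae_old: all-employee employment before benchmarking
    ae_new: all-employee employment after benchmarking
    """
    ratio = pw_old / ae_old      # ratio preserved from before the benchmark
    return ratio * ae_new        # applied to the revised all-employee figure

# Hypothetical basic cells: (pw_old, ae_old, ae_new), in thousands.
cells = {
    "cell_a": (800.0, 1000.0, 990.0),
    "cell_b": (450.0, 500.0, 510.0),
}

# Revise each basic cell, then aggregate to a summary estimate.
revised = {name: revise_production_workers(*levels) for name, levels in cells.items()}
summary_estimate = sum(revised.values())
print(revised, summary_estimate)
```

Working at the cell level first, as the text describes, means the summary ratio can shift even though each cell's own ratio is unchanged.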
The production or nonsupervisory worker employment estimates for the basic cells are used as weights for the hours and earnings estimates for broader industry groupings. Adjustments of the all-employee estimates to new benchmarks may alter the weights, which, in turn, may change the estimates for hours and earnings of production or nonsupervisory workers at higher levels of aggregation. Generally, new employment benchmarks have little effect on hours and earnings estimates for major groupings. To influence the hours and earnings estimates of a broader group, employment revisions have to be relatively large and must affect industries that have hours or earnings averages that are substantially different from those of other industries in their group.

Table 4 gives information on the levels of specific hours and earnings series resulting from the March 2007 benchmark. At the total private level, there was no change in average weekly hours from the previously published level, while average hourly earnings decreased from the previously published level by 1 cent.

Methods

Benchmark adjustment procedure. Establishment survey benchmarking is done on an annual basis to a population derived primarily from the administrative file of employees covered by unemployment insurance (UI). The time required to complete the revision process, from the full collection of the UI population data to publication of the revised industry estimates, is about 10 months. The benchmark adjustment procedure replaces the March sample-based employment estimates with UI-based population counts for March. The benchmark therefore determines the final employment levels, while sample movements capture month-to-month trends. Benchmarks are established for each basic estimating cell and are aggregated to develop published levels. On a not seasonally adjusted basis, the sample-based estimates for the year preceding and the year following the benchmark also are then subject to revision.
Employment estimates for the months between the most recent March benchmark and the previous year's benchmark are adjusted using a "wedge-back" procedure. In this process, the difference between the benchmark level and the previously published March estimate for each estimating cell is computed. This difference, or error, is linearly distributed across the 11 months of estimates subsequent to the previous benchmark; eleven-twelfths of the March difference is added to February estimates, ten-twelfths to January estimates, and so on, ending with the previous April estimates, which receive one-twelfth of the March difference. The wedge procedure assumes that the total estimation error accumulated at a steady rate since the last benchmark.

Applying previously derived over-the-month sample changes to the revised March level yields revised estimates for the months following the March benchmark. New net birth/death model estimates also are calculated and applied during post-benchmark estimation, and new sample is introduced from the annual update.

Benchmark source material. The principal source of benchmark data for private industries is the Quarterly Census of Employment and Wages (QCEW). These employment data are provided to State Employment Security Agencies by employers covered by State UI laws. BLS uses several other sources to establish benchmarks for the remaining industries partially covered or exempt from mandatory UI coverage, accounting for nearly 3 percent of the nonfarm employment total. Data on employees covered under Social Security laws, published by the U.S. Census Bureau in County Business Patterns, are used to augment UI data for industries not fully covered by the UI scope, such as nonoffice insurance sales workers, child daycare workers, religious organizations, and private schools and hospitals.
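The wedge-back procedure described earlier in this section can be sketched for a single estimating cell as follows. This is an illustrative sketch only: the function name and the employment levels are hypothetical, and the input is assumed to be the 12 previously published levels running from the April after the previous benchmark through the current March.

```python
def wedge_back(published, march_benchmark):
    """Distribute the March benchmark difference linearly over the prior year.

    published: 12 previously published levels, April through March.
    Returns revised levels; April receives 1/12 of the March difference,
    February 11/12, and March the full difference (the benchmark itself).
    """
    if len(published) != 12:
        raise ValueError("expected 12 monthly levels, April through March")
    difference = march_benchmark - published[-1]
    return [level + difference * (i + 1) / 12 for i, level in enumerate(published)]

# Hypothetical cell: a flat published series and a benchmark 12 (thousand) lower.
published = [1200.0] * 12
revised = wedge_back(published, march_benchmark=1188.0)

print(revised[0])    # April: receives one-twelfth of the difference
print(revised[10])   # February: receives eleven-twelfths
print(revised[11])   # March: equals the benchmark level
```

Because the adjustment grows by the same increment each month, the revised over-the-month changes differ from the published ones by a constant one-twelfth of the March difference.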
Benchmarks for State and local government hospitals and educational institutions are based on the Annual Census of Governments conducted by the Census Bureau. Benchmark data from these sources are available only on a lagged basis. Extrapolation to a current level is accomplished by applying the employment trends from the UI-covered part of the population in these industries to the noncovered part. Universe data for interstate railroads are obtained from the Railroad Retirement Board.

Business birth and death estimation. Regular updating of the CES sample frame with information from the UI universe files helps to keep the CES survey current with respect to employment from business births and business deaths. The timeliest UI universe files available, however, always will be a minimum of 9 months out of date. The CES survey thus cannot rely on regular frame maintenance alone to provide estimates for business birth and death employment contributions. BLS has researched both sample-based and model-based approaches to measuring birth units that have not yet appeared on the UI universe frame. Since the research demonstrated that sampling for births was not feasible in the very short CES production timeframes, the Bureau is utilizing a model-based approach for this component.

Earlier research indicated that while both the business birth and death portions of total employment are generally significant, the net contribution is relatively small and stable. To account for this net birth/death portion of total employment, BLS is utilizing an estimation procedure with two components. The first component uses business deaths to impute employment for business births. This is incorporated into the sample-based link relative estimate procedure by simply not reflecting sample units going out of business, but imputing to them the same trend as the other firms in the sample.
The second component is an ARIMA time series model designed to estimate the residual net birth/death employment not accounted for by the imputation. The historical time series used to create and test the ARIMA model was derived from the UI universe micro level database, and reflects the actual residual net of births and deaths over the past five years. The net birth/death model component figures are unique to each month and include negative adjustments in some months. Furthermore, these figures may exhibit a seasonal pattern observed in the historical UI universe data series.

Availability of revised data

LABSTAT, the BLS public database on the Internet, contains all historical employment, hours, and earnings data revised as a result of this benchmark, including both unadjusted and seasonally adjusted data. The data can be accessed at http://www.bls.gov/ces/, the Current Employment Statistics homepage.

Conversion to the 2007 North American Industry Classification System

Also with the release of the January 2008 data, the CES national nonfarm payroll series were updated to the 2007 North American Industry Classification System (NAICS) from the 2002 NAICS basis. The conversion to NAICS 2007 resulted in minor definitional changes within the manufacturing, information, financial activities, and professional and technical services sectors. The most significant revisions are in the Information sector, particularly within the Telecommunications area. None of the revisions crossed supersector boundaries.

In order to avoid time series breaks, all impacted series were reconstructed back to at least 1990. For a small number of series, the reconstruction extends back prior to 1990, to the previously existing start date of the series. The reconstruction methodology is based on the first quarter 2007 unemployment insurance (UI) microdata, which were coded on both a 2002 NAICS and a 2007 NAICS basis.
Ratios were established from this dual coded file; the ratios were used to map employment from the 2002 NAICS series to the 2007 NAICS series. For example, the March 2007 employment ratios for 2007 NAICS subsector 50-5171 (wired telecommunications carriers) indicate that 71.6 percent of the series is formed from 2002 NAICS 50-5171, 23 percent of it comes from 2002 NAICS 50-5175 (cable and other program distribution), and 5.4 percent is from 2002 NAICS 50-5181 (ISPs and web search portals). These ratios were applied to the 2002 NAICS series and the results were summed to derive the 2007 NAICS series. The 2002 NAICS to 2007 NAICS employment ratios, or distribution of employment from 2002 NAICS to 2007 NAICS, can be seen in exhibit 1. The 2007 NAICS to 2002 NAICS employment ratios, or the composition of the 2007 NAICS series from 2002 NAICS, can be seen in exhibit 2.

Exhibit 1. 2002 NAICS to 2007 NAICS employment ratios
Exhibit 2. 2007 NAICS to 2002 NAICS employment ratios
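The two kinds of ratios described above can be illustrated with a short sketch. Everything in it is hypothetical: the series labels, the distribution ratios, and the employment levels are invented for illustration and are not the values from exhibits 1 and 2. Distribution ratios give the share of each 2002 NAICS series that moves into a 2007 NAICS series; composition ratios give the share of a 2007 NAICS series that comes from each 2002 NAICS series.

```python
# Hypothetical distribution ratios (2002 series -> 2007 series shares).
distribution = {
    "2002_A": {"2007_X": 1.0},
    "2002_B": {"2007_X": 0.75, "2007_Y": 0.25},
}

# Hypothetical 2002-basis employment levels for one month, in thousands.
employment_2002 = {"2002_A": 400.0, "2002_B": 200.0}

def reconstruct(employment_2002, distribution):
    """Apply distribution ratios to 2002-basis levels and sum by 2007 series."""
    employment_2007 = {}
    for old_series, level in employment_2002.items():
        for new_series, share in distribution[old_series].items():
            employment_2007[new_series] = employment_2007.get(new_series, 0.0) + share * level
    return employment_2007

def composition(new_series, employment_2002, distribution):
    """Share of one 2007 series that originates in each 2002 series."""
    total = reconstruct(employment_2002, distribution)[new_series]
    return {
        old_series: distribution[old_series][new_series] * level / total
        for old_series, level in employment_2002.items()
        if new_series in distribution[old_series]
    }

employment_2007 = reconstruct(employment_2002, distribution)
composition_x = composition("2007_X", employment_2002, distribution)
print(employment_2007)
print(composition_x)
```

When the distribution ratios for each 2002 series sum to one, as assumed here, total employment is unchanged by the mapping.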
None of the revisions due to the 2007 NAICS conversion crossed supersector boundaries. However, in some instances, employment levels for impacted supersectors and higher-level aggregates may differ from the previously published levels. Any differences are minimal and are due to rounding of the lower level reconstructed series, which are aggregated to form the higher level series.

A comparable procedure is used for hours and earnings series. Reconstructed hours and earnings for the impacted series are produced from a weighted average of the 2002 NAICS component series, the weights being the 2002 NAICS to 2007 NAICS ratios (described above). An example of the hours and earnings reconstruction is illustrated in exhibit 3.

Exhibit 3. Hours and earnings reconstruction example using 2007 NAICS code 50-517100 - Wired telecommunications carriers (footnote 1)
1 Data are derived from March 2006 unemployment insurance data
2 The ratio represents the percent of employment in the 2002 NAICS industry that went into a specific 2007 NAICS industry
3 2007 NAICS data where sum represents new level of production workers, aggregate hours, and aggregate payrolls
4 Average weekly hours = aggregate hours/production workers
5 Average hourly earnings = aggregate payrolls/aggregate hours

As mentioned earlier, these ratios were used to reconstruct impacted series back to at least 1990. For April 2007 forward, the data for all 2007 NAICS series were produced in accordance with standard sample-based estimation techniques. The employment, hours, and earnings for impacted series were re-estimated using existing sample reports.

Changes to the CES published series

The conversion to 2007 NAICS caused several changes to the CES published series. Exhibit 4 shows discontinued 2002 NAICS series that have been reclassified into 2007 NAICS. Exhibit 5 shows new series as a result of 2007 NAICS. Exhibit 6 shows changes in scope to published series due to the 2007 NAICS reclassification.

Exhibit 4. Discontinued 2002 NAICS series and reclassification into 2007 NAICS series
Exhibit 5. New series as a result of 2007 NAICS
Exhibit 6. Change in scope due to 2007 NAICS
Additionally, the CES program conducts an annual review of sample adequacy for its estimation and publication cells and makes adjustments to the published series as warranted. This year several all employee series will be discontinued as a result of the annual review of sample employment and universe coverage. Exhibit 7 shows the discontinued all employee series due to the annual sample adequacy review. Exhibit 7. Discontinued all employee series
Review of the sample receipts has also led to the discontinuation of production worker, hours, and earnings estimates for some small industries that no longer have sufficient sample. Exhibits 8 and 9 show the series that will be discontinued. Exhibit 8. Discontinued production worker, hours, and earnings series
Exhibit 9. Discontinued average overtime series
Exhibit 10. Change in title
Seasonal adjustment procedure

BLS uses X-12 ARIMA software developed by the U.S. Census Bureau to seasonally adjust national employment, hours, and earnings series derived from the CES program. Individual series are seasonally adjusted using either a multiplicative or an additive model (Exhibit 11), and seasonal adjustment factors are directly applied to the component levels. For employment, individual 3-digit NAICS levels are seasonally adjusted, and higher level aggregates are formed by summing these components. Seasonally adjusted totals for hours and earnings are obtained by taking weighted averages of the seasonally adjusted data for the component series.

Special model adjustments

Variable survey intervals. Beginning with the release of the 1995 benchmark, BLS refined the seasonal adjustment procedures to control for survey interval variations, sometimes referred to as the 4- versus 5-week effect. Although the CES survey is referenced to a consistent concept, the pay period including the 12th of each month, inconsistencies arise because there are sometimes 4 and sometimes 5 weeks between the weeks including the 12th in a given pair of months. In highly seasonal industries, these variations can be an important determinant of the magnitude of seasonal hires or layoffs that have occurred at the time the survey is taken, thereby complicating seasonal adjustment.

Standard seasonal adjustment methodology relies heavily on the experience of the most recent 3 years to determine the expected seasonal change in employment for each month of the current year. Prior to the implementation of the adjustment, the procedure did not distinguish between 4- and 5-week survey intervals, and the accuracy of the seasonal expectation depended in large measure on how well the current year’s survey interval corresponded with those of the previous 3 years.
All else the same, the greatest potential for distortion occurred when the current month being estimated had a 5-week interval but the 3 years preceding it were all 4-week intervals, or conversely when the current month had a 4-week interval but the 3 years preceding it were all 5-week intervals. BLS adopted REGARIMA (regression with auto-correlated errors) modeling to identify the estimated size and significance of the calendar effect for each published series. REGARIMA combines standard regression analysis, which measures correlation among two or more variables, with ARIMA modeling, which describes and predicts the behavior of data series based on its own past history. For many economic time series, including nonfarm payroll employment, observations are auto-correlated over time; that is, each month’s value is significantly dependent on the observations that precede it. These series, therefore, usually can be successfully fit using ARIMA models. If auto-correlated time series are modeled through regression analysis alone, the measured relationships among other variables of interest may be distorted due to the influence of the auto-correlation. Thus, the REGARIMA technique is appropriate for measuring relationships among variables of interest in series that exhibit auto-correlation, such as nonfarm payroll employment. In this application, the correlations of interest are those between employment levels in individual calendar months and the lengths of the survey intervals for those months. The REGARIMA models evaluate the variation in employment levels attributable to 11 separate survey interval variables, one specified for each month, except March. March is excluded because there are almost always 4 weeks between the February and March surveys. Models for individual basic series are fit with the most recent 10 years of data available, the standard time span used for CES seasonal adjustment. 
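The survey interval itself is simple calendar arithmetic, and the sketch below shows one way to compute it. It is an illustration rather than the BLS implementation: the function names are hypothetical, and it assumes weeks run Sunday through Saturday, which the article does not specify.

```python
from datetime import date, timedelta

def reference_week_start(year, month):
    """Sunday beginning the week that contains the 12th (assumes Sunday-Saturday weeks)."""
    twelfth = date(year, month, 12)
    days_since_sunday = (twelfth.weekday() + 1) % 7   # Monday=0 ... Sunday=6
    return twelfth - timedelta(days=days_since_sunday)

def survey_interval_weeks(year, month):
    """Weeks between the previous month's reference week and this month's."""
    prev_year, prev_month = (year - 1, 12) if month == 1 else (year, month - 1)
    gap = reference_week_start(year, month) - reference_week_start(prev_year, prev_month)
    return gap.days // 7

# Interval lengths for each month of 2007.
intervals_2007 = {month: survey_interval_weeks(2007, month) for month in range(1, 13)}
print(intervals_2007)
```

Under this week definition the February to March interval is exactly 4 weeks in every non-leap year, because the two 12ths are 28 days apart, which is consistent with March being left out of the 11 interval variables.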
The REGARIMA procedure yields regression coefficients for each of the 11 months specified in the model. These coefficients provide estimates of the strength of the relationship between employment levels and the number of weeks between surveys for the 11 modeled months. The X-12 ARIMA software also produces diagnostic statistics that permit the assessment of the statistical significance of the regression coefficients, and all series are reviewed for model adequacy. Because the 11 coefficients derived from the REGARIMA models provide an estimate of the magnitude of variation in employment levels associated with the length of the survey interval, these coefficients are used to adjust the CES data to remove the calendar effect. These "filtered" series then are seasonally adjusted using the standard X-12 ARIMA software. For a few series, REGARIMA models do not fit well; these series are seasonally adjusted with the X-12 software but without the interval effect adjustment. There are several additional special effects modeled through the REGARIMA process; they are described below. Construction series. Beginning with the 1996 benchmark revision, BLS utilized special treatment to adjust construction industry series. In the application of the interval effect modeling process to the construction series, there initially was difficulty in accurately identifying and measuring the effect because of the strong influence of variable weather patterns on employment movements in the industry. Further research allowed BLS to incorporate interval effect modeling for the construction industry by disaggregating the construction series into its finer industry and geographic estimating cells and tightening outlier designation parameters. This allowed a more precise identification of weather-related outliers that had masked the interval effect and clouded the seasonal adjustment patterns in general. With these outliers removed, interval effect modeling became feasible. 
The result is a seasonally adjusted series for construction that is improved because it is controlled for two potential distortions: unusual weather events and the 4- versus 5-week effect.

Floating holidays. BLS is continuing the practice of making special adjustments for average weekly hours and average weekly overtime series to account for the presence or absence of religious holidays in the April survey reference period and the occurrence of Labor Day in the September reference period, back to the start date of each series.

Local government series. A special adjustment also is made in November each year to account for variations in employment due to the presence or absence of poll workers in the local government, excluding educational services series.

Refinements in hours and earnings seasonal adjustment. With the release of the 1997 benchmark, BLS implemented refinements to the seasonal adjustment process for the hours and earnings series to correct for distortions related to the method of accounting for the varying length of payroll periods across months. There is a significant correlation between over-the-month changes in both the average weekly hours (AWH) and the average hourly earnings (AHE) series and the number of weekdays in a month, resulting in noneconomic fluctuations in these two series. Both AWH and AHE show more growth in "short" months (20 or 21 weekdays) than in "long" months (22 or 23 weekdays). The effect is stronger for the AWH than for the AHE series.

The calendar effect is traceable to response and processing errors associated with converting payroll and hours information from sample respondents with semi-monthly or monthly pay periods to a weekly equivalent. The response error comes from sample respondents reporting a fixed number of total hours for workers regardless of the length of the reference month, while the CES conversion process assumes that the hours reporting will be variable.
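The "short" and "long" month distinction above depends only on the number of weekdays in a month, which can be counted as in the sketch below. The function names are hypothetical, and weekdays are assumed to mean Monday through Friday with no allowance for holidays.

```python
import calendar
from datetime import date

def weekdays_in_month(year, month):
    """Count Monday-Friday days in a month (holidays are not excluded)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return sum(
        date(year, month, day).weekday() < 5    # Monday=0 ... Friday=4
        for day in range(1, days_in_month + 1)
    )

def month_length_class(year, month):
    """'short' for 20 or 21 weekdays, 'long' for 22 or 23, as in the text."""
    return "short" if weekdays_in_month(year, month) <= 21 else "long"

# Weekday counts and classes for each month of 2007.
for month in range(1, 13):
    print(month, weekdays_in_month(2007, month), month_length_class(2007, month))
```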
A constant level of hours reporting most likely occurs when employees are salaried rather than paid by the hour, as employers are less likely to keep actual detailed hours records for such employees. This causes artificial peaks in the AWH series in shorter months that are reversed in longer months. The processing error occurs when respondents with salaried workers report hours correctly (vary them according to the length of the month), which dictates that different conversion factors be applied to payroll and hours. The CES processing system uses the hours conversion factor for both fields, resulting in peaks in the AHE series in short months and reversals in long months. REGARIMA modeling is used to identify, measure, and remove the length-of-pay-period effect for seasonally adjusted average weekly hours and average hourly earnings series. The length-of-pay-period variable proves significant for explaining AWH movements in all the service-providing industries except retail trade. For AHE, the length-of-pay-period variable is significant for wholesale trade, financial activities, professional and business services, and other services. All AWH series in the service-providing industries except retail trade have been adjusted from January 1990 forward. The AHE series for wholesale trade, financial activities, professional and business services, and other services have been adjusted from January 1990 forward as well. For this reason, calculations of over-the-year change in the establishment hours and earnings series should use seasonally adjusted data. The series to which the length-of-pay-period adjustment is applied are not subject to the 4- versus 5-week adjustment, as the modeling cannot support the number of variables that would be required in the regression equation to make both adjustments. See Exhibit 11 for series that have the calendar effects modeling described above. Exhibit 11. Model specifications.