

Consumer Price Index: Data Sources

The Consumer Price Index (CPI) is a measure of the average change over time in the prices paid by consumers for a representative basket of consumer goods and services. The CPI measures inflation as experienced by consumers in their day-to-day living expenses. The CPI is used to adjust income eligibility levels for government assistance, federal tax brackets, federally mandated cost of living increases, private sector wage and salary increases, and consumer and commercial rent escalations. Consequently, the CPI directly affects hundreds of millions of Americans.

The CPI is created from a series of interrelated surveys. The CPI requires

  • a geographic sample, which is a set of areas where prices will be collected;
  • a survey of consumer expenditures to create and appropriately weight a market basket of goods and services to be priced and to create a sample of outlets in which prices are collected; and
  • samples of prices for commodities, services, and housing.

Geographic sample

Using 2010 census population data, we select the urban areas from which data on prices are collected and choose the housing units within each area that are eligible for use in the shelter component of the CPI. The census data also provide information on the number of consumers represented by each area selected as a CPI price collection area. Additional information on the process of creating the geographic sample is available in the design section.

Consumer expenditure data

The CPI seeks to measure the change in the cost of living by measuring the average change over time in the prices paid by urban consumers for a market basket of consumer goods and services. For the CPI to be accurate, the market basket must correspond to what consumers are actually purchasing, and the different categories of items must be weighted to reflect their proportions in consumers' budgets.

The CPI uses data from the Consumer Expenditure (CE) survey to determine the weights of the different categories of goods and services in the CPI. The CE survey collects data on the out-of-pocket expenses spent to acquire all consumer products and services. The CPI uses the CE data to identify the goods within the CPI's scope. Information about the scope of the CPI is available in the concepts section. Annual CE data are used for the CPI-U and CPI-W weights; these expenditure weights are updated biennially. For example, annual CE data from 2015 and 2016 were used to construct a set of weights that were implemented in the CPI at the end of 2017 and were used through the end of 2019. Additional information about the CE survey is available in the Consumer Expenditure section of the BLS Handbook of Methods.
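As a rough illustration of how expenditure data can translate into category weights, the sketch below computes each category's share of total spending. The function name, category names, and dollar amounts are invented for illustration and are not CE survey figures; the actual CPI weighting procedure is more involved than this.

```python
def expenditure_shares(expenditures):
    """Return each category's share of total expenditure (shares sum to 1)."""
    total = sum(expenditures.values())
    if total <= 0:
        raise ValueError("total expenditure must be positive")
    return {category: amount / total for category, amount in expenditures.items()}


# Hypothetical annual spending per category, in dollars (not CE survey data).
example_spending = {
    "housing": 20000.0,
    "food": 8000.0,
    "transportation": 9000.0,
    "other": 13000.0,
}
example_weights = expenditure_shares(example_spending)
```

Under these made-up figures, housing accounts for 20,000 of 50,000 dollars, so it would carry a weight of 40 percent.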

Price data

The most fundamental data in the CPI are prices. CPI price data are collected via two surveys: one survey collects prices for commodities and services and the other survey collects prices for rent.

Commodities and services survey

The CPI survey collects about 94,000 prices per month to compute indexes for commodities and services. Approximately two-thirds of price collection in the CPI is done by personal visits of CPI data collectors to brick-and-mortar stores. The remaining data are collected by telephone or on the outlet’s website. In some cases, these data are supplemented by data provided from other sources.

The outlets where prices are collected are selected based on data from the CE survey. These outlets may be brick-and-mortar stores or websites (e-commerce); currently, about 8 percent of CPI quotes are collected from outlet websites.

Some secondary sources are also used in constructing the CPI sample. For example, data from the U.S. Department of Transportation database are used to construct the sample of fares in the airline fares index.

Housing survey

The CPI survey collects about 8,000 rental housing unit quotes each month to compute the indexes for the housing component. It uses price quotes for rent and homeowners' equivalent rent (an estimate of the implicit rent that owner occupants would have to pay if they were renting their homes) to compute estimates of price change. Because rents change rather infrequently, the CPI program collects rent data from each sampled unit every 6 months. Collecting rent data less frequently allows for a much larger sample. The CPI divides each area’s rent sample into six sub-samples called panels. The rents for panel 1 are collected in January and July; panel 2, in February and August, etc. Rents are collected by personal visit or phone.1
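The six-panel rotation can be sketched as a simple mapping from panel number to collection months. The text gives only panels 1 and 2 explicitly; the sketch assumes the same pattern continues through panel 6 (panel n is collected in month n and month n + 6). The function names are hypothetical.

```python
def panel_collection_months(panel):
    """Return the two calendar months (1-12) in which a rent panel is collected."""
    if not 1 <= panel <= 6:
        raise ValueError("panel must be between 1 and 6")
    # Assumed pattern: panel 1 -> January and July, panel 2 -> February and August, ...
    return (panel, panel + 6)


def panel_for_month(month):
    """Return the panel whose rents are collected in the given calendar month."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return (month - 1) % 6 + 1
```

With this layout, every sampled unit is visited twice a year, while one-sixth of the sample is collected in any given month.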

Repricing and quality adjustment

Prices for each item in the commodities and services survey are collected either every month or every other month, depending on the type of good or service and its location. Food at home, energy, and selected other items are priced monthly. So, too, are prices for all other commodity and service items in the three largest publication areas: New York, Los Angeles, and Chicago. Elsewhere, prices are collected bimonthly for the remaining commodity and service items; those are assigned to either even- or odd-numbered months for pricing.
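The pricing schedule described above can be sketched as a small decision rule. This is an illustration only: the "selected other items" priced monthly are not enumerated in the text, so the set below contains just food at home and energy, and the function name, the `bimonthly_parity` parameter, and the example area "Denver" are assumptions.

```python
MONTHLY_AREAS = {"New York", "Los Angeles", "Chicago"}
# The text also mentions "selected other items" that are not listed; they are omitted here.
MONTHLY_CATEGORIES = {"food at home", "energy"}


def is_priced_this_month(category, area, month, bimonthly_parity="odd"):
    """Return True if an item is scheduled for pricing in the given month (1-12)."""
    if bimonthly_parity not in ("odd", "even"):
        raise ValueError("bimonthly_parity must be 'odd' or 'even'")
    if category in MONTHLY_CATEGORIES or area in MONTHLY_AREAS:
        return True  # priced every month
    month_is_odd = month % 2 == 1
    return month_is_odd if bimonthly_parity == "odd" else not month_is_odd
```

For example, an apparel item in Chicago would be priced in April regardless of its assignment, while the same item in a hypothetical bimonthly area such as "Denver" would be priced in April only if it were assigned to even-numbered months.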

Most repricing is done by personal visit from a CPI data collector, but in other cases repricing is done by a website visit or by telephone. If the selected item is available, a data collector records its price, and the recorded information is reviewed by commodity analysts who have detailed knowledge about the particular good or service. Unusual price movements are reviewed carefully and checked for validity. Because the price index formula cannot handle a price of zero (a free item), a zero price is adjusted to a very small price.
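One way to see why a zero price is a problem is to consider a price relative, the ratio of the current price to the previous price, which is a common building block of price indexes. A zero in the denominator makes the ratio undefined. The sketch below substitutes a small floor for a zero price; the floor of 0.01 and the function names are assumptions, since the text does not state the value actually used.

```python
ZERO_PRICE_FLOOR = 0.01  # hypothetical "very small price"; the actual value is not given in the text


def adjust_zero_price(price, floor=ZERO_PRICE_FLOOR):
    """Replace a zero price with a small positive floor; leave other prices unchanged."""
    if price < 0:
        raise ValueError("price must not be negative")
    return floor if price == 0 else price


def price_relative(current_price, previous_price):
    """Ratio of current to previous price, after adjusting any zero prices."""
    current = adjust_zero_price(current_price)
    previous = adjust_zero_price(previous_price)
    return current / previous
```

Without the adjustment, `price_relative(3.0, 0)` would raise a `ZeroDivisionError`; with it, the ratio is defined, although very large.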

If the selected item is no longer available, or if there have been changes in the quality or quantity of the good or service since the last time prices were collected (for example, a container of orange juice containing 59 ounces instead of 64 ounces), the data collector selects a new item similar to the old item. This is referred to as a substitution.

When substitution occurs, the commodity analyst reviews the new item and price. The new price may be quality adjusted for use in index computation. Conceptually, the CPI seeks to be a constant-quality measure, though accurately quantifying quality change may not always be possible. Detailed information about quality adjustment procedures is in the calculation section.

Alternative data sources

Although most of the prices used to compute the CPI are collected by BLS through the process described above, in some cases these data are supplemented by data from other sources.

Airline fares

Data from the U.S. Department of Transportation database are used to construct the sample of fares in the airline fares index.

Apparel and household goods

Among the many firms that participate in the CPI survey, one firm provides BLS with a large volume of price data rather than allowing data collectors to collect data in stores. Additional information on this methodology is available in the paper “Big Data in the U.S. Consumer Price Index: Experiences & Plans.”

Postage

For sample selection, the delivery of household mail by type of postal service and postal zone is determined by the United States Postal Service (USPS) Household Diary Survey. We collect monthly prices from the price list on the USPS public website.

Prescription drugs

One firm provides BLS with a large volume of price data rather than allowing data collectors to collect data in stores. Additional information on how BLS prices prescription drugs is available in the medical care factsheet.

Used cars and trucks

For used cars and trucks, both the sample and prices come from alternative sources. The current CPI sample of used cars and trucks comes from the J. D. Power Information Network, which is a network of car dealers who report sales of used vehicles to the J. D. Power Company. From the universe of 2- through 8-year-old vehicles, we choose a sample of 480 vehicles. The 480 observations are replicated in all of the CPI areas (after tax adjustments). The sample is updated by one model year each September, October, or November to maintain the same-age vehicles over time. If a production model is discontinued, it is replaced by a comparable model. A complete resampling is conducted every 5 years.
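The annual model-year update can be sketched as follows: advancing every sampled vehicle by one model year when the pricing year advances keeps each vehicle's age constant. The vehicle names, years, and function names below are hypothetical and are not drawn from the actual sample.

```python
def vehicle_age(model_year, pricing_year):
    """Age of a vehicle, in years, at the time it is priced."""
    return pricing_year - model_year


def advance_sample(sample):
    """Return a copy of the sample with every vehicle moved forward one model year."""
    return [{**vehicle, "model_year": vehicle["model_year"] + 1} for vehicle in sample]


# Hypothetical two-vehicle sample covering the youngest and oldest eligible ages.
sample_2019 = [
    {"model": "Hypothetical Sedan", "model_year": 2017},
    {"model": "Hypothetical Truck", "model_year": 2011},
]
sample_2020 = advance_sample(sample_2019)

ages_2019 = [vehicle_age(v["model_year"], 2019) for v in sample_2019]
ages_2020 = [vehicle_age(v["model_year"], 2020) for v in sample_2020]
```

In this example the ages stay at 2 and 8 years in both pricing years, which is the same-age property the update is meant to preserve.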

All price information for used cars and trucks in the CPI comes from the National Automobile Dealers Association Official Used Car Guide. All prices are adjusted for depreciation of the vehicle. Additional information about how BLS prices used cars is available in the used cars and trucks factsheet.

Confidentiality

Data from the pricing surveys are collected under pledges of confidentiality, and BLS is bound by law to protect the confidentiality of respondents. Data collection and security procedures are governed by provisions of the Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA).

The BLS confidentiality pledge

Respondents to BLS surveys receive a confidentiality pledge assuring them of the BLS commitment to keep their information secure. For the price surveys (both the commodities and services survey and the housing survey), the pledge is as follows:

The Bureau of Labor Statistics, its employees, agents, and partner statistical agencies, will use the information you provide for statistical purposes only and will hold the information in confidence to the full extent permitted by law. In accordance with the Confidential Information Protection and Statistical Efficiency Act of 2002 (Title 5 of Public Law 107-347) and other applicable Federal laws, your responses will not be disclosed in identifiable form without your informed consent.

Multiple confidentiality issues arise in the production of the CPI. One set of issues arises out of the need to prevent unauthorized access to data that are embargoed, or yet to be released to the public. Because the CPI data can affect financial markets, it is essential to ensure that no one without authorization has access to the data before release. BLS personnel who do see the data ahead of time are restricted by law from engaging in certain financial transactions during the period in which they have seen data that are not public. Pre-release data are encrypted and always kept on secure servers, with any hard copies locked in secured areas.

In some cases, publication of indexes could risk revealing the pricing behavior of a respondent, and publication of those data is suppressed according to specific thresholds.

Another set of issues relates to the protection of respondent identifying information (RII) and personally identifiable information (PII). We do not release price data from the sample to the public, nor do we confirm or deny that any specific product or brand is in the price sample, or that any specific store or seller is in the outlet sample.

CPI employees are trained annually on procedures to protect the security of embargoed data and the privacy of respondents. Additional information about CPI confidentiality procedures is available at the BLS Confidentiality of Data Collected by BLS for Statistical Purposes page.

Notes

1 For more information on Owners’ equivalent rent of primary residence (OER) and Rent of primary residence (Rent), please see https://www.bls.gov/cpi/factsheets/owners-equivalent-rent-and-rent.pdf. Monthly Labor Review, April 2014

Last Modified Date: November 24, 2020