The U.S. Bureau of Labor Statistics (BLS) regularly explores ways to integrate alternative data sources and evaluate the fitness for use of these data for BLS data collection programs. In this article, we evaluate transactional data in the context of how it could be used to complement the Consumer Expenditure Surveys (CE). We first discuss the differences between transactional data and survey data and describe the challenges in developing a concordance to compare trends. Next, we compare spending changes in high-level industries by using the two matched sources over the period of January 2019 to December 2021. We find that in some industries the data sources show very similar trends (like food, accommodation, and entertainment), but data sources have a lower correlation in trends for other industries (like information and telecommunication). Ultimately, while we find that transactional data may provide insight into specific data trends, transactional data would be difficult to integrate with CE data because transactional data are organized by merchant type instead of by the type of good or service provided, and transactional data lack demographic details.
Government agencies produce a variety of statistical indicators that provide useful economic information for policymakers and the public to aid them in making informed decisions with reliable and high-quality data. There are several advantages to government agencies creating statistical indicators, including impartiality of the producers, national representation, and measurement of specific economic activities. For example, the Consumer Expenditure Surveys (CE), conducted by the U.S. Census Bureau for the U.S. Bureau of Labor Statistics (BLS), measure how households in the United States spend their money and provide spending statistics by a variety of sociodemographic characteristics.1
Government agencies face substantial challenges in measuring economic activity. Many statistics are constructed from in-person or telephone surveys that are expensive to regularly conduct. Additionally, declining survey response rates not only make it increasingly expensive to maintain precision but may also increase the potential for nonresponse bias in these surveys.2 Furthermore, surveys collected by using in-person methods tend to have a lag when it comes to producing statistics because of the amount of time required to collect, edit, and publish the data. Faced with these obstacles, many government agencies are turning to alternative sources of economic data to supplement or replace traditionally collected survey data.3
The COVID-19 pandemic, particularly the early period of 2020, introduced several challenges to collecting economic data. Risks of in-person transmission meant that surveyors had to shift from in-person data collection to telephone or online data collection. Traditional household surveys usually provide data with a lag, and this characteristic posed a particular challenge during the pandemic: there was tremendous interest in assessing the pandemic’s impact on consumer spending, but the CE could not provide timely data.
In contrast, the organization Opportunity Insights filled the void for relevant and timely information on consumer spending by leveraging transactional data to track consumer spending. Opportunity Insights is a not-for-profit organization that provides publicly available statistics that measure economic activity at a higher frequency than survey data collection. Opportunity Insights provides these timely statistics by using granular and anonymized private sector data. The organization, which is affiliated with Harvard University, is overseen by academic researchers. In 2020, Opportunity Insights began constructing indicators on consumer spending, employment, and business revenue and activity, with an emphasis on updating the data in a timely fashion. One of the goals of the researchers was to pass on their methodology and sources to government agencies who could produce real-time national accounts. Opportunity Insights approached the U.S. Bureau of Economic Analysis, the U.S. Census Bureau, and BLS in 2020 with this proposal and soon afterwards granted government economists access to Opportunity Insights data.4 Since 2020, the agencies have been evaluating Opportunity Insights data sources and methodology to determine whether and how they could be leveraged to produce official statistics.
This article considers the branch of the real-time national accounts project that focuses on BLS consumer spending analysis and evaluates consumer behavior during the unusual COVID-19 pandemic period, specifically from January 2019 to December 2021. Opportunity Insights consumer spending data are derived from credit card transaction data. Those data are described in more detail below.5 If Opportunity Insights’ measure of consumer spending is comparable to the CE results but with a shorter lag, then BLS could benefit from supplementing CE data with Opportunity Insights data. For example, one suggested use case would be to provide a preliminary result of average household spending estimates based on changes in spending measured by transactional data. Ultimately, however, our analysis did not find a clear use for transactional data because of inherent comparability issues between the two sources of data.
The CE are conducted on behalf of BLS by the U.S. Census Bureau. Typically, BLS releases tables of estimates of average annual spending and related public use microdata files in the fall following the year in which the data were collected. An additional mid-year release of the public use microdata occurs in the spring following the previous annual release and reflects an additional two quarters of spending.6 Data for the CE are collected by using two sources: an Interview Survey that captures a 3-month reference period of spending for a consumer unit (CU); and a Diary Survey in which CUs are expected to enter their daily spending into a diary for two consecutive 1-week periods.7 The Interview Survey is designed to be conducted in person, but data have been increasingly collected by telephone. Together, the two sources capture the full profile of consumer spending and represent what is spent on goods and services as well as allocations to personal insurance and pensions (including allocations to employer plans and Social Security). As previously mentioned, data from these two sources are combined to produce the published annual estimates. One of the primary uses of the CE is to provide weights for the Consumer Price Index (CPI). However, there are many other applications of CE data because the spending records contain detailed demographic information. More details on the CE and its methodology can be found in the BLS Handbook of Methods.8
CE data consist of all transactions for a given sample of CUs. This group of CUs is designed to be representative of the U.S. noninstitutionalized population. The transactions of each CU are assigned a BLS-designated category called a Universal Classification Code (UCC) that provides details on what good or service was purchased. It is important to note that the majority of CE data do not contain the merchant or store where these expenses were made or the specific method of payment. Although CE data do contain the merchant names for a select subset of expenditures and rotating subgroups of the sample to provide information for the CPI outlet frame, there are not sufficient data collected on merchant names to be useful for comparing CE data to transactional data.
Transactional data, particularly credit card transactions, are inherently different from survey-collected data: unlike a survey, transactional data are not designed to measure any specific economic concept. One of the primary uses of CE data is to provide weights for the CPI to reflect the spending for goods and services by consumers within select geographic areas. For example, consumer spending is aggregated by commodity and within geographic areas to produce weights used in generating the official Consumer Price Index for All Urban Consumers. Specifically, data are organized by different types of expenditures so that the CPI program can measure price changes for the good or service that was purchased and the geographic area where it was purchased. In CE data, all spending is considered, not only credit card spending. Conversely, credit card transactional data are organized by the merchants where the transactions took place and do not reflect purchases made by other means, such as cash or checks. These merchants can be organized into broader categories, such as food and beverage stores or gas stations. A drawback of credit card transactional data is that the detail of specific expenditures within a transaction is not available; therefore, transactional data are difficult to align into categories that are comparable to those in the CE.
Despite the inherent differences, there is still value in exploring the use of credit card transactional data. At the beginning of our study, we did not anticipate that a full replacement of the CE data by transactional data would be possible because credit card transactional data neither have the level of detail needed nor the various demographic variables available in the surveys. However, we did consider if Opportunity Insights consumer spending data could be used to produce aggregate-based CE estimates more frequently than the current 6-month and 12-month lagged expenditures. Opportunity Insights spending data are collected at a daily frequency and often published at a weekly frequency. If Opportunity Insights data are comparable to the CE, then BLS might be able to use Opportunity Insights data to impute changes in consumer spending for shorter periods of time than are possible with the current 6-month and 12-month CE data releases.
Another use considered for the transactional data is to evaluate errors in the CE. Given that any survey is subject to both sampling and measurement errors, we could identify differences in estimates that might identify sources of error within the CE. Specifically, portions of the CE rely first on the cooperation of respondents and then on their recall ability. An advantage of credit card transaction data is that it provides a direct measure of the amount spent and the date of purchase. However, there are also other types of errors that are associated with transactional data. Opportunity Insights obtained transactional data from a private firm that aggregates credit card and debit card data to support a variety of financial products for financial institutions (the “data aggregator”). One potential error in the samples of captured credit card data is that they lack geographic representativity. The transactional data obtained by Opportunity Insights covered only 10 percent of total credit card transactions. And there may be a systematic difference between transactions that are captured by and omitted from the data. In Opportunity Insights’ evaluation of the data, they find that the data aggregator’s transactional data overrepresent spending categories in which credit and debit card spending are used for purchases, which suggests that data captured in the transactional dataset will not be representative of all consumer transactions. If credit and debit card spending meaningfully differ from all consumer transactions, then indices derived from the transactional data may not reflect all consumer expenditures.
The first step in our evaluation to assess the usefulness of transactional data in the production of CE published estimates was to conceptually compare the data and to examine trends over time. If Opportunity Insights data contain a nonrepresentative sample of consumer spending or if the types of spending cannot easily be compared with the CE, then we could expect these two time series to display different patterns over time. In this case, there may be limited uses for Opportunity Insights data to complement the CE. Below, we describe the steps we took to align the data sources and create comparisons. Our goal was to determine whether there is utility in Opportunity Insights transactional data for the CE.
As noted above, Opportunity Insights obtained transactional credit card and debit card data from a firm that aggregates credit card and debit card data. Researchers at Opportunity Insights used the transactional data from the data aggregator to measure changes in consumer spending during the pandemic. The transactional dataset contains about 10 percent of credit and debit card spending in the United States, and the data aggregator collects information about the date, amount, and merchant code of the transaction.9 The data aggregator sends Opportunity Insights an updated extract of data on a weekly basis, which allows researchers to construct spending statistics with a relatively short time lag.
The transactional data do not contain any demographic information for the households associated with the transactions other than the zip code of their residence.10 Thus, it is difficult to compare spending for individual households or demographic groups between the transactional dataset and the CE. Because of these limitations in the correspondence of information collected in the datasets, our following analyses will focus on comparing aggregates in consumer spending.
The underlying information captured by credit card transactions includes the following details: the businesses where the transactions occurred along with Merchant Category Codes (MCCs), the dates of transactions, and the amounts of each transaction. The MCCs are a set of codes used by credit card companies to classify the types of goods and services that a business or a company offers. MCCs relate solely to the business and how it is categorized and not with the specific categories of goods or services that were purchased. In the dataset provided to Opportunity Insights, consumer spending is aggregated to higher levels of categorization according to the MCCs of the transactions for confidentiality reasons. The data aggregator has collaborated with Opportunity Insights to develop a mapping from MCCs to the business sector. The sector classification schema was originally not related to other definitions, but, in 2021, the U.S. Census Bureau and the U.S. Bureau of Economic Analysis assisted Opportunity Insights in mapping sectors listed in the data aggregator's dataset to be more closely aligned to North American Industry Classification System (NAICS) codes. For example, Opportunity Insights now reports spending on the sector FBS (food and beverage stores), which is comparable to NAICS code 445.
The central dilemma when comparing Opportunity Insights data with CE data is that spending is assigned to merchants and sectors, not goods and services. While some sectors only sell a few types of goods or services, other sectors sell many types of goods or services. For example, if a consumer makes a $40.15 transaction at a gasoline station, the transaction is reported under the Opportunity Insights sector GAS in the given zip code and on the given date. However, the consumer could have purchased more than just gasoline or no gasoline at all. That is, the customer could have purchased gasoline, cigarettes, nonalcoholic beverages, or some combination of the three. In the scenario in which a consumer purchased a combination of all items, it would still only be reflected as GAS in the transactional dataset. If the same consumer was a respondent in the CE, then they would have reported their transaction as three separate UCCs. In order to truly compare CE spending reports to data from the data aggregator, we would need a method to allocate transactions from the data aggregator's data to their relevant UCCs in the CE.11
While the data aggregator preserves merchant confidentiality by aggregating to business sectors, it preserves customer confidentiality by reporting credit card transactions only at the zip-code level. In other words, individual customer spending data are not available. Aggregated purchases are assigned to the county and the median income quartile of the zip code of the cardholder’s residence. The data aggregator reports the zip-code income quartile for a given county, sector, and date only when the cell contains at least five transactions made with a credit or debit card. To preserve the anonymity of customers, cells with fewer than five such transactions are aggregated to a higher level, and Opportunity Insights groups them under the zip-code income quartile category REST. For example, if the zip-code income quartile cell for a given date, county, and MCC contains only three observations, then the data aggregator will report only the date, county, and MCC.
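The small-cell rule described above can be illustrated with a minimal sketch. It assumes cells keyed by date, county, sector, and zip-code income quartile; the key layout, the function name, and the sample values are hypothetical and are not taken from the data aggregator's actual processing.

```python
from collections import defaultdict

MIN_TRANSACTIONS = 5  # threshold stated in the text
REST = "REST"         # catch-all quartile label named in the text


def suppress_small_cells(cells):
    """Fold cells with fewer than MIN_TRANSACTIONS into a REST cell.

    `cells` maps (date, county, sector, quartile) -> (n_transactions, amount).
    This key layout is a hypothetical stand-in for the real extract.
    """
    out = defaultdict(lambda: [0, 0.0])
    for (date, county, sector, quartile), (n, amount) in cells.items():
        # Keep the quartile only when the cell is large enough to publish.
        label = quartile if n >= MIN_TRANSACTIONS else REST
        agg = out[(date, county, sector, label)]
        agg[0] += n
        agg[1] += amount
    return {key: (n, amount) for key, (n, amount) in out.items()}


# Hypothetical sample: the Q2 and Q3 cells fall below the threshold.
sample = {
    ("2020-03-01", "County A", "GAS", "Q1"): (12, 480.00),
    ("2020-03-01", "County A", "GAS", "Q2"): (3, 95.50),
    ("2020-03-01", "County A", "GAS", "Q3"): (2, 61.25),
}
result = suppress_small_cells(sample)
```

In this sketch the two small cells are pooled into a single REST cell for the same date, county, and sector, so their quartile detail is lost while their spending is retained.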
One consequence of transaction data and the confidentiality policy is that we cannot determine whether the spending recorded in the transactional dataset is nationally representative. Records in the dataset are not reported at the card, person, or household level but at the zip-code level, which prevents us from assessing the composition of the cardholders. Additionally, and as noted above, spending is assigned by the type of merchant, rather than the type of good or service purchased. Therefore, in the remainder of the article, we do not attempt to match the level of spending in the CE with the Opportunity Insights dataset. Rather, we calculate the month-to-month percent change in consumer spending between the CE and the Opportunity Insights dataset.
Because of the underlying differences in the data sources, creating a comparison of the CE dataset with the Opportunity Insights dataset required several assumptions. Thus, we faced several challenges with respect to comparability.
First, the surveys in the CE are used to collect data to reflect different reference periods depending on the source (Interview or Diary) and expenditure type. For the Interview Survey, the microdata files record expenditures as monthly values. However, recorded expenditures could have been collected for a shorter or longer period of time. For example, if data are collected with reference to the last 3 months, the quarterly expenditure may be split evenly across all months or be randomly assigned to a month within the 3-month reference period. For the Diary Survey, data are collected as a weekly expense and extrapolated as a monthly expense. This extrapolation was necessary in order to align the Diary Survey with the Interview Survey and to compare the CE dataset with the Opportunity Insights dataset. The effect of these adjustments on the subsequent comparisons is that the monthly measured change in expenditures may be more volatile (for months randomly assigned), muted (when quarterly expenditures are distributed across months), or extreme (for expenditures inflated to a monthly value in the Diary Survey).
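The three time adjustments described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the weekly-to-monthly scaling factor (days in the month divided by 7) is an assumption, since the text does not state the exact factor used for the Diary Survey.

```python
import random


def split_evenly(quarterly_amount, months=3):
    """Spread a quarterly expenditure equally across the reference months."""
    return [quarterly_amount / months] * months


def assign_randomly(quarterly_amount, rng, months=3):
    """Place the whole quarterly expenditure in one randomly chosen month."""
    allocation = [0.0] * months
    allocation[rng.randrange(months)] = quarterly_amount
    return allocation


def weekly_to_monthly(weekly_amount, days_in_month):
    """Scale a one-week Diary amount to a month (assumed factor: days / 7)."""
    return weekly_amount * days_in_month / 7


rng = random.Random(0)  # fixed seed so the sketch is reproducible
even = split_evenly(300.0)             # same amount in each month (muted changes)
lumpy = assign_randomly(300.0, rng)    # full amount in one month (volatile changes)
monthly_diary = weekly_to_monthly(70.0, 28)  # one week inflated to a month
```

The even split produces no month-to-month change within the quarter, whereas the random assignment concentrates the whole amount in a single month, which is the source of the muted and volatile patterns noted above.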
A second challenge was that the CE data contain sparse observations for some of the comparable Opportunity Insights sectors that are aligned to NAICS-based industries. In some cases, we further aggregate the Opportunity Insights sectors that are aligned to NAICS-based industries to a broader classification to ensure that we have sufficient observations when comparing CE data to Opportunity Insights data.12 This also serves the purpose of allowing more flexibility in assigning UCCs to spending categories.
The third, and core, challenge in comparing the data sources was developing a concordance for the comparison of expenditures. In order to develop the concordance, we had to map CE UCCs to the Opportunity Insights sectors that are aligned to NAICS-based industries. As mentioned previously, CE data are organized by good or service and not by the merchant where a purchase was made, which is how the data aggregator's data and NAICS codes are oriented. To assist in our analysis, we used an existing concordance developed by BLS for comparing UCC categories from the CE with Personal Consumption Categories from the U.S. Bureau of Economic Analysis.13 Ultimately, the final concordance we used was created by manually reviewing each UCC used to create the CE expenditure tables and matching it to a likely NAICS code.14 As a result, some expenditures that the CU may have made at a merchant in one industry were instead assigned to a different industry. The impact of this approach is lessened by aggregating industries together. However, some discrepancies could remain even at the high level of aggregation that we use for our comparisons.
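A concordance of this kind is essentially a many-to-one lookup followed by a sum. The sketch below assumes a simplified record layout of (month, UCC, amount). The UCC labels are hypothetical placeholders (real UCCs are numeric codes), and the mappings shown are illustrative rather than entries from the actual concordance file.

```python
from collections import defaultdict

# Hypothetical concordance: placeholder UCC labels -> broad industry.
# These labels and mappings are illustrative only.
UCC_TO_INDUSTRY = {
    "UCC_TELEVISION": "nonfood retail",
    "UCC_HOTEL_STAY": "food, accommodation, and entertainment",
    "UCC_RESTAURANT_MEAL": "food, accommodation, and entertainment",
}


def aggregate_by_industry(records, concordance):
    """Sum (month, ucc, amount) records into (month, industry) totals.

    Returns the totals plus the set of UCCs missing from the concordance,
    so unmapped spending is reported instead of silently dropped.
    """
    totals = defaultdict(float)
    unmapped = set()
    for month, ucc, amount in records:
        industry = concordance.get(ucc)
        if industry is None:
            unmapped.add(ucc)
            continue
        totals[(month, industry)] += amount
    return dict(totals), unmapped


records = [
    ("2019-01", "UCC_HOTEL_STAY", 250.0),
    ("2019-01", "UCC_RESTAURANT_MEAL", 150.0),
    ("2019-01", "UCC_TELEVISION", 400.0),
    ("2019-01", "UCC_UNKNOWN", 10.0),
]
totals, unmapped = aggregate_by_industry(records, UCC_TO_INDUSTRY)
```

Returning the unmapped codes separately is a design choice for this sketch: a manual concordance is easy to leave incomplete, and surfacing the gaps makes the review step explicit.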
For example, consider the purchase of a television, which may have been bought by a respondent from a big-box store (general merchandise retailers) or from a specialty electronics store (electronics and appliance retailers). For our purpose, both purchase locations map to the higher level of aggregation that we present as nonfood retail, and there will not be any discrepancies in the aggregate. However, to use another example, a consumer could also purchase groceries at a big-box store (general merchandise retailers) or at a grocery store (food and beverage stores), and these purchases would map to the two following separate categories: nonfood retail or food, accommodation, and entertainment, respectively. The following extreme example demonstrates the impact of these discrepancies: if all CUs doubled their spending on groceries and kept spending on other categories constant, the CE data in the comparisons would show the increased spending in the food, accommodation, and entertainment category and no change in the nonfood retail category. But if a consumer purchases food items at both big-box retail stores and grocery stores, then this increase would be reflected in the transactional data for both the food, accommodation, and entertainment category and the nonfood retail category. This extreme example demonstrates how differences in our comparisons may relate to these underlying assumptions.
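The grocery example can be made concrete with a small calculation. All dollar amounts below are hypothetical, chosen only to show how the same spending shock produces different percent changes when purchases are classified by good (as in the CE) and by merchant (as in the transactional data).

```python
def pct_change(before, after):
    """Percent change from `before` to `after`."""
    return (after - before) / before * 100


# Hypothetical baseline spending, in dollars.
groceries_at_grocery_store = 70.0
groceries_at_big_box = 30.0
other_goods_at_big_box = 200.0

# CE view: spending is classified by the good purchased.
ce_food_before = groceries_at_grocery_store + groceries_at_big_box
ce_nonfood_before = other_goods_at_big_box

# Transactional view: spending is classified by the merchant.
tx_food_before = groceries_at_grocery_store
tx_nonfood_before = groceries_at_big_box + other_goods_at_big_box

# Shock: grocery spending doubles at every merchant; other spending is flat.
ce_food_after = 2 * ce_food_before
ce_nonfood_after = ce_nonfood_before
tx_food_after = 2 * groceries_at_grocery_store
tx_nonfood_after = 2 * groceries_at_big_box + other_goods_at_big_box

ce_changes = (
    pct_change(ce_food_before, ce_food_after),
    pct_change(ce_nonfood_before, ce_nonfood_after),
)
tx_changes = (
    pct_change(tx_food_before, tx_food_after),
    pct_change(tx_nonfood_before, tx_nonfood_after),
)
```

Under these assumed amounts, the CE view shows the entire increase in the food category and none in nonfood retail, while the transactional view also shows an increase in nonfood retail because big-box grocery purchases are recorded under that merchant type.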
The final assumption that must be made when comparing changes in the CE data to the Opportunity Insights data is that spending with cash or check changes at rates similar to spending with credit or debit cards. This is a necessary assumption because the CE does not include means of payment in the data.
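A short derivation shows why this assumption matters. The notation is introduced here only for illustration: let $C_t$ be card spending in month $t$, $K_t$ be cash and check spending, $E_t = C_t + K_t$ be total spending, and $s_{t-1} = C_{t-1}/E_{t-1}$ be the card share of spending in the prior month, with $C_{t-1} > 0$ and $K_{t-1} > 0$. Then the growth rate of total spending is a share-weighted average of the two component growth rates:

$$
\frac{E_t - E_{t-1}}{E_{t-1}} = s_{t-1}\,\frac{C_t - C_{t-1}}{C_{t-1}} + \left(1 - s_{t-1}\right)\frac{K_t - K_{t-1}}{K_{t-1}}
$$

If cash and check spending grows at the same rate as card spending, the right-hand side collapses to the card growth rate, so the change measured in card data equals the change in total spending. If the two rates differ, the gap between the card-based change and the total change depends on both the difference in rates and the noncard share $1 - s_{t-1}$.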
Because Opportunity Insights data are aggregated across households and not representative of all spending, our measure of analysis was the change in spending over time. To conduct the analysis, we aggregated spending to categories that we determined are the most comparable for both Opportunity Insights data and CE data. Thus, our measure was based on the summing of spending in the CE by using the UCCs mapped to the Opportunity Insights concordance file to obtain an industry-based measure of spending. We used this measure of spending to construct a month-to-month rate of change. As the Opportunity Insights data are already organized by a NAICS-based classification system, creating aggregate credit card spending by industry is a simpler task. In order to construct a rate of change, we first summed all credit card spending by month for each NAICS-based industry. Next, as mentioned in the prior section, we further aggregated the sectors to a broader industry definition (e.g., healthcare).15 Finally, we combined the CE data and Opportunity Insights data by month and broad industry. Although we do have the transaction value for the Opportunity Insights data, we lack the information to produce nationally representative population aggregates or averages. As a result, the meaningful statistic that can be derived from the Opportunity Insights data is the rate of change in the underlying spending.
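The month-to-month rate of change can be computed from monthly totals as sketched below. The function name, the "YYYY-MM" key format, and the sample totals are assumptions made for illustration; the first month has no prior month and is left undefined, which corresponds to the dash in the January 2019 rows of the tables.

```python
def month_to_month_pct_change(monthly_totals):
    """Return {month: percent change from the prior month}.

    `monthly_totals` maps "YYYY-MM" strings to spending totals. The first
    month, and any month whose prior total is zero, maps to None.
    """
    months = sorted(monthly_totals)  # "YYYY-MM" sorts chronologically
    changes = {}
    previous = None
    for month in months:
        current = monthly_totals[month]
        if previous is None or previous == 0:
            changes[month] = None
        else:
            changes[month] = (current - previous) / previous * 100
        previous = current
    return changes


# Hypothetical monthly totals for one broad industry.
totals = {"2019-01": 200.0, "2019-02": 190.0, "2019-03": 228.0}
changes = month_to_month_pct_change(totals)
```

Because only the ratio of consecutive months enters the calculation, the result does not depend on whether the totals are nationally representative levels, which is why this statistic can be derived from both sources.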
Our primary test of the reliability of the Opportunity Insights data is whether they yield expenditure changes similar to those in the CE data. In the charts below, we compare the resulting month-on-month percent changes in national expenditures by industry classification between the CE dataset (integrated Interview and Diary data) and the Opportunity Insights dataset for January 2019 through December 2021.16 The industry classifications are as follows: food, accommodation, and entertainment; food and beverage stores; healthcare; information and telecommunication; nonfood retail; public administration and government; rental, administration, and other services; transportation and warehousing; and utilities, construction, and manufacturing.
The similarity of Opportunity Insights and CE trends varies by sector. In some of the industry sectors, such as food, accommodation, and entertainment, CE data and Opportunity Insights data are remarkably similar in terms of change. For other sectors, such as information and telecommunication, changes in the Opportunity Insights data are substantially more volatile than changes in the CE data. This discrepancy may be caused in part by lumpiness in credit and debit card spending in the public administration and government and the information and telecommunication categories. If consumers make large, infrequent purchases in these categories, then there may be large swings in month-to-month spending, even as annual spending is relatively stable. Spending on public administration and government and on information and telecommunication in the Opportunity Insights dataset also appears quite seasonal. However, it is unclear why spending in the CE dataset is not similarly volatile.
The COVID-19 pandemic substantially affected consumer spending behavior, which prompted the original Opportunity Insights collaboration with the data aggregator. For example, there was a noticeable decline in spending on food, accommodation, and entertainment during March 2020 in both the Opportunity Insights and the CE data, followed by a brief recovery in spending in subsequent months. In contrast, nonfood retail spending dropped sharply in January 2020 and then fluctuated throughout the rest of the year. These oscillations could reflect seasonal factors, such as holiday shopping in November and December. Nevertheless, these changes reflect the importance of accurately capturing variations in consumer spending.
The results of these comparisons are not consistent. While the two data sources show very similar trends in some sectors, their trends in other sectors are quite different. This leaves us with a lack of clarity about how we could use this type of transactional data in the CE.
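One simple way to quantify how closely two series of percent changes move together is the Pearson correlation coefficient. The sketch below is illustrative and is not necessarily the measure used in our comparisons; the function name is hypothetical. The inputs are the March 2020 through June 2020 rows of the first table in this article, and four observations are far too few for a meaningful estimate.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Month-to-month percent changes, Mar-Jun 2020, from the first table.
ce_changes = [-39.313, -75.322, 30.636, 28.920]
transactional_changes = [-35.090, -58.965, 40.321, 24.173]

r = pearson(ce_changes, transactional_changes)
```

Applying the same function to the full 35-month series for each industry would give one coefficient per industry, which allows the sectors to be ranked by how closely the two sources track each other.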
| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –1.784 | –2.650 |
| Mar 2019 | 12.557 | 17.966 |
| Apr 2019 | 23.391 | –9.010 |
| May 2019 | –27.852 | 7.115 |
| Jun 2019 | 5.726 | 2.379 |
| Jul 2019 | 13.366 | 1.247 |
| Aug 2019 | –11.694 | –1.918 |
| Sep 2019 | 0.725 | –7.275 |
| Oct 2019 | –4.784 | 0.517 |
| Nov 2019 | –11.925 | –6.437 |
| Dec 2019 | 17.111 | 0.938 |
| Jan 2020 | –15.360 | 1.784 |
| Feb 2020 | 7.799 | 0.268 |
| Mar 2020 | –39.313 | –35.090 |
| Apr 2020 | –75.322 | –58.965 |
| May 2020 | 30.636 | 40.321 |
| Jun 2020 | 28.920 | 24.173 |
| Jul 2020 | 20.939 | 7.518 |
| Aug 2020 | –0.514 | 5.320 |
| Sep 2020 | –0.097 | –3.965 |
| Oct 2020 | 2.782 | 4.520 |
| Nov 2020 | –24.884 | –14.495 |
| Dec 2020 | 13.307 | –1.739 |
| Jan 2021 | 4.204 | 9.762 |
| Feb 2021 | –6.515 | –1.035 |
| Mar 2021 | 26.073 | 21.347 |
| Apr 2021 | 9.862 | 2.534 |
| May 2021 | 3.630 | 11.274 |
| Jun 2021 | 13.263 | 2.728 |
| Jul 2021 | 12.667 | 6.651 |
| Aug 2021 | –12.507 | –7.985 |
| Sep 2021 | –16.315 | –6.488 |
| Oct 2021 | 7.654 | 7.434 |
| Nov 2021 | –15.901 | –11.827 |
| Dec 2021 | 11.181 | 1.394 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –1.911 | –5.980 |
| Mar 2019 | 0.094 | 8.813 |
| Apr 2019 | 13.670 | –3.909 |
| May 2019 | –3.751 | 4.876 |
| Jun 2019 | –12.025 | –1.799 |
| Jul 2019 | 6.925 | 2.185 |
| Aug 2019 | –0.683 | 0.719 |
| Sep 2019 | 3.480 | –5.966 |
| Oct 2019 | 6.180 | 2.465 |
| Nov 2019 | –17.477 | 6.150 |
| Dec 2019 | 22.001 | 8.591 |
| Jan 2020 | –21.347 | –15.298 |
| Feb 2020 | 4.246 | –1.667 |
| Mar 2020 | 22.177 | 28.436 |
| Apr 2020 | –14.718 | –12.625 |
| May 2020 | 4.142 | 6.032 |
| Jun 2020 | –13.376 | –7.807 |
| Jul 2020 | 16.072 | 4.396 |
| Aug 2020 | 7.698 | –1.044 |
| Sep 2020 | –17.430 | –5.520 |
| Oct 2020 | 12.464 | 4.975 |
| Nov 2020 | –3.899 | 1.862 |
| Dec 2020 | 3.146 | 11.795 |
| Jan 2021 | 4.852 | –12.949 |
| Feb 2021 | –11.450 | –9.855 |
| Mar 2021 | 6.755 | 5.999 |
| Apr 2021 | –14.003 | –4.170 |
| May 2021 | 14.485 | 4.877 |
| Jun 2021 | 0.354 | –5.981 |
| Jul 2021 | 6.621 | 4.285 |
| Aug 2021 | –10.296 | –0.487 |
| Sep 2021 | 1.252 | –3.892 |
| Oct 2021 | 0.868 | 4.983 |
| Nov 2021 | –4.490 | –0.310 |
| Dec 2021 | 15.424 | 12.090 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –11.951 | –4.603 |
| Mar 2019 | 9.088 | 7.921 |
| Apr 2019 | –19.461 | 0.090 |
| May 2019 | 20.953 | –1.147 |
| Jun 2019 | –15.798 | –7.947 |
| Jul 2019 | 5.333 | 5.429 |
| Aug 2019 | –5.363 | –0.415 |
| Sep 2019 | 6.766 | –5.358 |
| Oct 2019 | 9.470 | 9.126 |
| Nov 2019 | –3.368 | –12.425 |
| Dec 2019 | –1.698 | –5.128 |
| Jan 2020 | –6.889 | 22.820 |
| Feb 2020 | –3.032 | –7.899 |
| Mar 2020 | 1.451 | –20.004 |
| Apr 2020 | –51.744 | –51.901 |
| May 2020 | 22.324 | 32.487 |
| Jun 2020 | 15.876 | 30.664 |
| Jul 2020 | 1.307 | 4.872 |
| Aug 2020 | 1.291 | –0.837 |
| Sep 2020 | –19.691 | 2.931 |
| Oct 2020 | 25.296 | 4.215 |
| Nov 2020 | 2.682 | –9.623 |
| Dec 2020 | –17.688 | 1.755 |
| Jan 2021 | 8.831 | 11.499 |
| Feb 2021 | 8.441 | –4.850 |
| Mar 2021 | 16.110 | 17.196 |
| Apr 2021 | –12.556 | –7.743 |
| May 2021 | –11.698 | –7.486 |
| Jun 2021 | 14.991 | 5.969 |
| Jul 2021 | 8.893 | –3.678 |
| Aug 2021 | –6.667 | 3.648 |
| Sep 2021 | –1.476 | –2.421 |
| Oct 2021 | 7.596 | –0.576 |
| Nov 2021 | 5.897 | –3.215 |
| Dec 2021 | –15.628 | –4.555 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –0.569 | –7.540 |
| Mar 2019 | –0.044 | 12.496 |
| Apr 2019 | 1.387 | –5.552 |
| May 2019 | 0.315 | 3.336 |
| Jun 2019 | –1.286 | –1.799 |
| Jul 2019 | –1.089 | 3.530 |
| Aug 2019 | –0.540 | –1.291 |
| Sep 2019 | 1.503 | –0.413 |
| Oct 2019 | 1.479 | 2.657 |
| Nov 2019 | 0.022 | –1.202 |
| Dec 2019 | –0.326 | 1.083 |
| Jan 2020 | 0.735 | 3.804 |
| Feb 2020 | 0.622 | –6.605 |
| Mar 2020 | 0.941 | 3.222 |
| Apr 2020 | 0.635 | –1.271 |
| May 2020 | 0.605 | 2.381 |
| Jun 2020 | –0.582 | –3.171 |
| Jul 2020 | –0.680 | 4.410 |
| Aug 2020 | –0.984 | –1.289 |
| Sep 2020 | 0.394 | –0.024 |
| Oct 2020 | 0.485 | 2.010 |
| Nov 2020 | 0.600 | –0.649 |
| Dec 2020 | 1.402 | 4.539 |
| Jan 2021 | –0.803 | 5.491 |
| Feb 2021 | –0.787 | –11.490 |
| Mar 2021 | –1.035 | 9.308 |
| Apr 2021 | 0.273 | –8.529 |
| May 2021 | –0.218 | –0.546 |
| Jun 2021 | 0.532 | –1.467 |
| Jul 2021 | –1.127 | 4.533 |
| Aug 2021 | –0.453 | –2.653 |
| Sep 2021 | 0.288 | 0.675 |
| Oct 2021 | 0.481 | 4.607 |
| Nov 2021 | –0.195 | –3.371 |
| Dec 2021 | –0.004 | 3.154 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | 4.071 | –7.100 |
| Mar 2019 | 11.910 | 15.218 |
| Apr 2019 | 9.770 | –1.225 |
| May 2019 | 3.570 | 6.500 |
| Jun 2019 | –5.417 | –6.061 |
| Jul 2019 | –3.001 | 2.524 |
| Aug 2019 | 1.018 | 0.920 |
| Sep 2019 | 2.431 | –7.066 |
| Oct 2019 | –1.720 | 4.155 |
| Nov 2019 | 2.417 | 8.460 |
| Dec 2019 | 12.050 | 10.827 |
| Jan 2020 | –28.897 | –25.262 |
| Feb 2020 | –1.260 | –4.675 |
| Mar 2020 | –1.824 | –1.174 |
| Apr 2020 | –6.834 | –8.831 |
| May 2020 | 23.757 | 25.085 |
| Jun 2020 | –7.656 | 4.310 |
| Jul 2020 | 13.933 | 2.674 |
| Aug 2020 | 4.260 | 0.165 |
| Sep 2020 | –11.066 | –3.439 |
| Oct 2020 | –2.071 | 4.145 |
| Nov 2020 | –0.289 | 7.404 |
| Dec 2020 | 20.973 | 8.060 |
| Jan 2021 | –22.235 | –18.261 |
| Feb 2021 | –1.024 | –9.870 |
| Mar 2021 | 13.019 | 19.297 |
| Apr 2021 | 3.134 | –2.504 |
| May 2021 | 3.120 | 2.673 |
| Jun 2021 | –7.444 | –3.875 |
| Jul 2021 | 22.014 | –0.370 |
| Aug 2021 | –1.580 | –0.099 |
| Sep 2021 | –16.171 | –2.999 |
| Oct 2021 | –5.243 | 5.974 |
| Nov 2021 | –7.795 | 9.025 |
| Dec 2021 | 22.396 | 2.284 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –0.011 | –15.820 |
| Mar 2019 | –1.258 | 20.927 |
| Apr 2019 | 0.140 | 36.991 |
| May 2019 | 2.318 | –52.181 |
| Jun 2019 | –0.303 | –1.042 |
| Jul 2019 | –3.032 | –1.006 |
| Aug 2019 | 0.622 | –1.207 |
| Sep 2019 | 2.627 | –5.453 |
| Oct 2019 | 0.487 | 9.562 |
| Nov 2019 | 0.999 | –9.838 |
| Dec 2019 | –0.723 | 17.742 |
| Jan 2020 | –1.467 | 2.092 |
| Feb 2020 | 4.035 | –13.987 |
| Mar 2020 | 4.089 | 2.581 |
| Apr 2020 | 1.161 | –4.495 |
| May 2020 | 1.536 | –5.564 |
| Jun 2020 | –0.503 | 14.391 |
| Jul 2020 | –0.535 | 28.475 |
| Aug 2020 | 0.132 | –32.309 |
| Sep 2020 | –1.450 | 0.407 |
| Oct 2020 | 1.293 | 5.892 |
| Nov 2020 | –2.460 | –6.566 |
| Dec 2020 | 7.180 | 18.258 |
| Jan 2021 | –3.748 | –1.060 |
| Feb 2021 | –0.057 | –16.778 |
| Mar 2021 | –0.265 | 21.792 |
| Apr 2021 | 0.231 | –1.707 |
| May 2021 | 1.684 | 9.044 |
| Jun 2021 | –0.507 | –25.099 |
| Jul 2021 | 0.461 | –2.964 |
| Aug 2021 | 1.157 | 0.757 |
| Sep 2021 | 1.146 | –3.463 |
| Oct 2021 | –0.214 | 9.447 |
| Nov 2021 | –0.653 | –5.266 |
| Dec 2021 | 1.669 | 8.833 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –10.534 | –8.072 |
| Mar 2019 | 3.614 | 11.252 |
| Apr 2019 | 3.097 | –3.753 |
| May 2019 | –0.068 | 1.385 |
| Jun 2019 | –2.975 | –4.654 |
| Jul 2019 | –1.378 | 4.655 |
| Aug 2019 | 18.635 | 2.667 |
| Sep 2019 | –13.227 | –8.098 |
| Oct 2019 | –5.934 | 1.840 |
| Nov 2019 | 4.770 | –7.646 |
| Dec 2019 | 16.789 | 4.584 |
| Jan 2020 | –7.781 | 10.375 |
| Feb 2020 | –10.186 | –5.804 |
| Mar 2020 | –2.580 | –8.318 |
| Apr 2020 | –0.074 | –16.170 |
| May 2020 | –3.702 | 12.210 |
| Jun 2020 | 4.960 | 9.675 |
| Jul 2020 | 6.177 | 4.959 |
| Aug 2020 | 5.737 | 2.836 |
| Sep 2020 | –1.506 | –3.122 |
| Oct 2020 | –4.231 | 2.373 |
| Nov 2020 | –4.845 | –8.681 |
| Dec 2020 | 20.899 | 8.483 |
| Jan 2021 | –11.659 | 4.535 |
| Feb 2021 | –9.351 | –7.585 |
| Mar 2021 | 2.893 | 16.752 |
| Apr 2021 | 7.248 | –5.390 |
| May 2021 | –5.165 | 0.241 |
| Jun 2021 | 0.490 | –1.065 |
| Jul 2021 | 1.580 | 0.812 |
| Aug 2021 | 8.116 | 1.687 |
| Sep 2021 | –7.462 | –4.909 |
| Oct 2021 | –4.561 | 1.071 |
| Nov 2021 | 3.075 | –3.384 |
| Dec 2021 | 15.538 | 2.161 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –12.475 | –4.626 |
| Mar 2019 | 20.226 | 15.167 |
| Apr 2019 | –9.590 | –8.174 |
| May 2019 | 6.354 | 2.838 |
| Jun 2019 | 10.607 | –2.254 |
| Jul 2019 | 12.304 | –2.920 |
| Aug 2019 | –8.012 | –5.807 |
| Sep 2019 | –11.119 | –2.169 |
| Oct 2019 | –4.622 | 2.326 |
| Nov 2019 | –1.718 | –11.049 |
| Dec 2019 | 12.565 | –7.368 |
| Jan 2020 | –6.208 | 24.990 |
| Feb 2020 | –30.552 | –10.086 |
| Mar 2020 | –32.830 | –50.225 |
| Apr 2020 | –80.753 | –78.305 |
| May 2020 | –31.028 | 28.921 |
| Jun 2020 | 28.193 | 21.378 |
| Jul 2020 | 40.080 | –9.360 |
| Aug 2020 | –22.257 | 1.202 |
| Sep 2020 | 0.303 | 6.803 |
| Oct 2020 | 1.284 | 8.617 |
| Nov 2020 | 21.284 | –14.568 |
| Dec 2020 | –6.748 | –6.640 |
| Jan 2021 | –7.990 | 12.116 |
| Feb 2021 | 10.533 | 1.325 |
| Mar 2021 | 9.840 | 34.812 |
| Apr 2021 | 46.987 | 5.238 |
| May 2021 | 7.177 | 13.099 |
| Jun 2021 | 20.441 | 10.568 |
| Jul 2021 | 34.444 | –3.237 |
| Aug 2021 | –27.845 | –14.729 |
| Sep 2021 | –18.143 | –0.261 |
| Oct 2021 | –0.676 | 13.021 |
| Nov 2021 | –0.367 | –3.270 |
| Dec 2021 | 14.182 | –16.858 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

| Date | Consumer Expenditure Survey | Transactional data |
|---|---|---|
| Jan 2019 | – | – |
| Feb 2019 | –0.717 | –4.714 |
| Mar 2019 | –2.775 | 10.187 |
| Apr 2019 | –2.785 | –1.487 |
| May 2019 | –0.541 | 0.766 |
| Jun 2019 | 15.788 | –3.455 |
| Jul 2019 | –5.059 | 12.056 |
| Aug 2019 | 7.211 | –0.323 |
| Sep 2019 | –7.923 | –3.410 |
| Oct 2019 | –1.721 | 5.441 |
| Nov 2019 | –3.061 | –16.197 |
| Dec 2019 | 1.558 | –6.442 |
| Jan 2020 | 3.132 | 14.955 |
| Feb 2020 | –2.425 | –6.029 |
| Mar 2020 | 1.035 | 1.105 |
| Apr 2020 | –6.970 | –5.578 |
| May 2020 | 8.813 | 6.167 |
| Jun 2020 | 5.598 | 12.136 |
| Jul 2020 | 12.898 | 8.695 |
| Aug 2020 | –8.651 | –0.410 |
| Sep 2020 | –7.925 | –0.349 |
| Oct 2020 | 6.274 | 1.467 |
| Nov 2020 | –3.613 | –16.299 |
| Dec 2020 | –2.707 | 1.656 |
| Jan 2021 | 6.303 | 10.039 |
| Feb 2021 | –4.607 | –2.770 |
| Mar 2021 | 10.760 | 13.431 |
| Apr 2021 | –19.397 | –9.579 |
| May 2021 | 15.335 | –5.229 |
| Jun 2021 | 2.393 | 6.197 |
| Jul 2021 | 2.794 | 3.104 |
| Aug 2021 | 3.058 | 3.993 |
| Sep 2021 | –5.558 | –1.341 |
| Oct 2021 | 0.600 | 1.053 |
| Nov 2021 | –13.821 | –10.347 |
| Dec 2021 | 10.841 | –2.525 |

Note: Dash indicates not applicable. Source: Data aggregator’s data and U.S. Bureau of Labor Statistics.

Opportunity Insights data offer an opportunity to rapidly augment consumer spending data, because changes in the transactional data could serve as preliminary estimates of changes in CE data. However, using Opportunity Insights data to complement the CE poses several challenges. The greatest problem is that the CE, like other BLS products such as the CPI, is defined by the type of good or service purchased, while the Opportunity Insights data are organized by the industry of the merchant involved in the transaction. Because businesses can sell a wide variety of products, and the same products can be purchased from very different types of businesses, any link between spending documented in Opportunity Insights data and in CE data will be imprecise. For example, the correlation between the two datasets in information technology and healthcare is quite low. Nevertheless, our calculations find that trends in the Opportunity Insights data and CE data strongly agree in some broad industry groups, namely food, accommodation, and entertainment. Therefore, given the limitations of the Opportunity Insights transactional data, we conclude that these data could not replace CE data.
We believe that some alternative data, such as private sector data, can assist BLS in completing its mission to measure labor market activities and price changes. However, our findings in this article demonstrate that some of the characteristics of alternative data need to be accounted for before these data sources can successfully supplement current public statistics. Nonetheless, one possible use of alternative data is identifying emerging trends in consumer spending, particularly for those categories that we have identified as most comparable. Accordingly, this article can assist in identifying the potential use of alternative transactional datasets for public statistical agencies.
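To make the comparison of trends concrete, the agreement between two monthly change series can be summarized with a correlation coefficient. The following is a minimal sketch, assuming a Pearson correlation (the exact statistic behind the article's results is not stated in this section). The function name `pearson` and the choice of rows are illustrative assumptions; the values are copied from the February 2019 to July 2019 rows of the last table above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Feb-Jul 2019 rows of the last table above (assumed excerpt, for brevity).
ce = [-0.717, -2.775, -2.785, -0.541, 15.788, -5.059]  # Consumer Expenditure Survey
tx = [-4.714, 10.187, -1.487, 0.766, -3.455, 12.056]   # Transactional data

r = pearson(ce, tx)
print(f"correlation over {len(ce)} months: {r:.3f}")
```

Six months is far too short a window to draw conclusions from; the excerpt only keeps the example brief. In practice the same function would be applied to each industry's full series from February 2019 to December 2021, skipping the January 2019 rows, which have no value.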
Acknowledgement: We are thankful for the assistance of Thesia I. Garner in conducting this study. We also thank the participants of the research collaboration among Opportunity Insights, the U.S. Bureau of Economic Analysis, the U.S. Bureau of Labor Statistics, and the U.S. Census Bureau.
Laura Erhard and Hugh Montag, "The value of transactional data to a consumer spending survey," Monthly Labor Review, U.S. Bureau of Labor Statistics, January 2026, https://doi.org/10.21916/mlr.2026.3
1 The U.S. Bureau of Labor Statistics (BLS) produces tables of expenditure means as well as public use microdata based on Consumer Expenditure Surveys (CE) results. For more information, see “Consumer Expenditure Surveys” (U.S. Bureau of Labor Statistics), https://www.bls.gov/cex/.
2 The recent report The Nation’s Data at Risk documents declining response rates to sample surveys and rising costs across federal statistical agencies. See The Nation’s Data at Risk: Meeting America’s Information Needs for the 21st Century (American Statistical Association, 2024), https://www.amstat.org/policy-and-advocacy/the-nation's-data-at-risk-meeting-american's-information-needs-for-the-21st-century.
3 BLS has explored several alternative data sources relevant to the CE, including federal tax information, commercially sourced housing characteristics (such as property tax and housing values), and transactional data from Nielsen. For further details, see Laura Erhard, Brett McBride, and Adam Safir, “A framework for the evaluation and use of alternative data in the Consumer Expenditure Surveys,” Monthly Labor Review, February 2021, https://doi.org/10.21916/mlr.2021.2.
4 See “Developing experimental statistics to measure economic activity in real time” (Federal Economic Statistical Advisory Committee (FESAC) meeting, June 2021), https://apps.bea.gov/fesac/meetings/2021-06-11/Dunn-DESMEA-FESAC-2021-final.pdf.
5 In this article, the term “credit card transactions” refers to both credit and debit card transactions because the Opportunity Insights data are derived from transactions that were conducted with both.
6 In 2025, BLS temporarily suspended the mid-year release of the data.
7 A consumer unit (CU) is defined as a group of individuals within a household who are related by blood or legal arrangement or who share major expenses. A CU is the same as a household 98 percent of the time. See “Consumer unit,” Glossary (U.S. Bureau of Labor Statistics), https://www.bls.gov/bls/glossary.htm.
8 “Consumer expenditures and income: overview,” Handbook of Methods (U.S. Bureau of Labor Statistics, last modified September 12, 2022), https://www.bls.gov/opub/hom/cex/home.htm.
9 Raj Chetty, John Friedman, Michael Stepner, and the Opportunity Insights Team, “The economic impacts of COVID-19: evidence from a new public database built using private sector data,” The Quarterly Journal of Economics, vol. 139, no. 2, May 2024, pp. 829–889, https://doi.org/10.1093/qje/qjad048; see p. 839.
10 Opportunity Insights researchers added demographic information for geographic areas based on the 2014–2018 American Community Survey to the data file.
11 The U.S. Bureau of Economic Analysis has developed a method to allocate different product lines to industries for the National Income and Product Accounts, but we chose not to pursue those methods at this time. See “Chapter 5: personal consumption expenditures,” Concepts and Methods of the U.S. National Income and Product Accounts (U.S. Bureau of Economic Analysis, December 2024), https://www.bea.gov/resources/methodologies/nipa-handbook/pdf/all-chapters.pdf#page=96.
12 For example, the data aggregator reports some transactions as Ambulatory Health Care Services (AMB), which are associated with North American Industry Classification System (NAICS) code 621. We aggregate these transactions with other healthcare transactions to generate the category Healthcare (HCE). We perform this aggregation because some Universal Classification Codes (UCCs) in the CE do not neatly map onto the NAICS codes that are used by the data aggregator.
13 The CE Data Comparisons page for Personal Consumption Expenditures has more details; see “Personal Consumption Expenditures,” Consumer Expenditure Surveys (U.S. Bureau of Labor Statistics, last modified November 9, 2023), https://www.bls.gov/cex/cecomparison/pce_profile.htm.
14 For the final concordance we use, see source data for Supplement A: Mapping of CE UCCs to industries.
15 See source data for Supplement B: Mapping of the data aggregator's categories to industries.
16 The CE publications are based on both an Interview Survey and a Diary Survey that are integrated based on the “best” sources of the expenditure to create a full profile of spending. We follow the sources used for the published CE tables when creating our expenditure estimates. For more information on integration, see “Integrated survey data,” Consumer Expenditures and Income: Calculation, Handbook of Methods (U.S. Bureau of Labor Statistics, last modified September 12, 2022), https://www.bls.gov/opub/hom/cex/calculation.htm#integrated-survey-data.