Evaluation of Patterns of Missing Prices in CPI Data

Harold Gomes

Abstract

The validity of an imputation method for missing data depends on the assumptions made about the underlying data and the mechanism of missingness. This empirical investigation evaluates patterns of missing prices in Consumer Price Index (CPI) microdata; that is, it assesses missing prices in relation to other covariates (auxiliary variables). The CPI is an official statistic that measures U.S. inflation and is estimated from a multistage probability sample design. The variable of interest for the CPI target population is the price of a quote (item), collected monthly from a representative market basket. CPI microdata are used to evaluate which missingness mechanism applies: Missing Completely at Random (MCAR), Missing at Random (MAR), or Missing Not at Random (MNAR). The study uses exploratory analysis, statistical tests, and data visualization. Important benefits of this research are: 1) to examine the validity of the current imputation methods, which use group-mean imputation with periodic updates to the group definitions; 2) to identify variables related to missingness in MAR situations; and 3) to provide potential recommendations for future improvement.
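
For reference, the three missingness mechanisms named above can be written in the standard notation of the missing-data literature. The symbols are not taken from this paper and are introduced only for illustration: $R$ is the indicator that a price is missing, $Y_{\text{obs}}$ and $Y_{\text{mis}}$ are the observed and missing prices, and $X$ denotes fully observed covariates. Some authors use a weaker form of MCAR that allows dependence on $X$; the strict form is shown here.

```latex
% Assumed notation (illustrative, not from the paper):
%   R      : missingness indicator for a price
%   Y_obs  : observed prices,  Y_mis : missing prices
%   X      : fully observed covariates (auxiliary variables)
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, X) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, X) = P(R \mid Y_{\text{obs}}, X) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, X) \text{ depends on } Y_{\text{mis}}
                    \text{ even after conditioning on } Y_{\text{obs}} \text{ and } X
\end{align*}
```

Under these definitions, an observed association between $R$ and a covariate in $X$ is evidence against strict MCAR. Observed data alone generally cannot distinguish MAR from MNAR, because the distinction rests on the unobserved values $Y_{\text{mis}}$.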