Sample Size Optimization of the Consumer Price Index: An Implementation using R

Harold Gomes and William H. Johnson

Abstract

The Consumer Price Index (CPI) is estimated from a multistage probability sampling design. To determine the optimal number of items and outlets to sample across the United States, a nonlinear constrained optimization method, known as the Item-Outlet Optimization Program (IOOP), is used. IOOP calculates optimal sample sizes for the commodities and services component of the CPI, which accounts for about 70% of the CPI weight. Previous BLS literature has described the mathematical basis of this method. The CPI program currently performs this computation in SAS. In this study, we provide technical details for a practical implementation in R, along with an intuitive interpretation and infographics. What distinguishes IOOP from classic methods, such as Neyman allocation, is its level of practicality and complexity. IOOP generates optimal sample sizes that minimize the overall CPI variance while maintaining fixed budgets and scope, as well as other constraints. The fixed scope consists essentially of the parameters (labor hours, travel time, response rates, etc.) that account for the realities of data collection. The R implementation validates the SAS results, and the details can benefit agencies seeking a method that accounts for scope.

Key Words: Consumer Price Index (CPI), nonlinear optimization,