
The Producer Price Index at the Bureau of Labor Statistics currently uses cell mean imputation for missing
price data. In the time since the implementation of the current process, multiple imputation methods have
become much easier to use on large data sets. In this study, we investigate alternatives to the current
procedure. We examined several multiple imputation methods available in R packages, including CART,
random forest, and AMELIA (a bootstrap EM algorithm). We also introduce a hybrid imputation method that
combines cell mean and random forest techniques. Imputation accuracy for the missing prices was measured
by the root mean squared error (RMSE) of the resulting PPI estimates. Results from the study will be discussed.
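
As a rough illustration of the baseline method and the evaluation metric named above, the following Python sketch fills each missing price with the mean of the observed prices in the same cell and computes an RMSE between estimates and reference values. It is a minimal sketch under stated assumptions, not the PPI production procedure: the cell labels, the record layout, and the function names `cell_mean_impute` and `rmse` are hypothetical, and the study applies RMSE to index estimates rather than to individual prices as in this toy example.

```python
import math
from collections import defaultdict


def cell_mean_impute(records):
    """Replace missing prices (None) with the mean observed price of their cell.

    `records` is a list of (cell, price) pairs; this layout is an assumption
    made for illustration only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    # First pass: accumulate observed prices per cell.
    for cell, price in records:
        if price is not None:
            sums[cell] += price
            counts[cell] += 1

    imputed = []
    # Second pass: fill each gap with its cell mean.
    for cell, price in records:
        if price is None:
            if counts[cell] == 0:
                raise ValueError(f"cell {cell!r} has no observed prices")
            price = sums[cell] / counts[cell]
        imputed.append((cell, price))
    return imputed


def rmse(estimates, truths):
    """Root mean squared error between two equal-length sequences."""
    if len(estimates) != len(truths) or len(estimates) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data: two cells, each with one missing price.
records = [("A", 10.0), ("A", None), ("A", 14.0), ("B", 5.0), ("B", None)]
imputed = cell_mean_impute(records)
```

In a comparison like the one described, the same `rmse` function would be applied to the index estimates produced under each imputation method, with the estimates computed from complete data serving as the reference.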