
Bayesian Multiscale Multiple Imputation with Implications to Data Confidentiality

Scott H. Holan, Daniell Toth, Marco A. R. Ferreira, and Alan F. Karr

Abstract

Many scientific, sociological, and economic applications produce data collected at multiple scales of resolution. One particular form of multiscale data arises when data are aggregated across different scales both longitudinally and by economic sector. Such data sets frequently contain missing observations that can be accurately imputed using the method we propose, Bayesian multiscale multiple imputation. This method borrows information both longitudinally and across different levels of aggregation to produce accurate imputations of missing observations, as well as estimates that respect the constraints imposed by the multiscale nature of the data. Our approach couples dynamic linear models with a novel imputation step based on singular normal distribution theory. Although our method is of independent interest, one important implication of the methodology is its potential effect on confidential databases protected by cell suppression. To demonstrate the proposed methodology and to assess the effectiveness of disclosure practices in longitudinal databases, we conduct a large-scale empirical study using the U.S. Bureau of Labor Statistics Quarterly Census of Employment and Wages (QCEW). In the course of this empirical investigation, we find that several of the predicted cells are within 1 percent of their true values, raising potential concerns for data confidentiality.
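
To illustrate why an aggregation constraint leads to a singular normal distribution, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, that the suppressed cells have independent normal predictive distributions (in the paper these would come from dynamic linear models) and that their sum is a published total. Conditioning on that sum gives a normal distribution whose covariance matrix is rank deficient, and every draw from it reproduces the total exactly. The function name `impute_given_total` and all numbers are hypothetical.

```python
import random


def impute_given_total(means, variances, total, rng):
    """One draw of the suppressed cells, conditioned on their known sum.

    Assumes independent normal predictive distributions for the cells
    (an illustrative simplification, not the paper's model).
    """
    # Unconstrained draw from each cell's predictive distribution
    z = [rng.gauss(m, v ** 0.5) for m, v in zip(means, variances)]
    # Spread the gap to the published total in proportion to each variance;
    # this yields an exact draw from the conditional (singular) normal
    gap = total - sum(z)
    var_sum = sum(variances)
    return [zi + gap * v / var_sum for zi, v in zip(z, variances)]


rng = random.Random(0)
means = [120.0, 80.0, 45.0]    # hypothetical predictive means
variances = [25.0, 16.0, 9.0]  # hypothetical predictive variances
published_total = 250.0        # hypothetical published aggregate

# Multiple imputation: several constrained draws of the same suppressed cells
draws = [impute_given_total(means, variances, published_total, rng)
         for _ in range(5)]
for d in draws:
    print([round(x, 2) for x in d], "sum =", round(sum(d), 6))
```

Each printed row sums to the published total, which is the sense in which the imputations respect the multiscale constraint. With correlated cells and covariance matrix Σ, the same adjustment would use the weights Σ1 / (1ᵀΣ1) in place of the variance shares.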