Longitudinal Microdata Outlier Detection Techniques

Eric Simants


With more than 8.5 million new records processed each quarter, the Bureau of Labor Statistics' Longitudinal Database (LDB) is one of the most comprehensive business registry lists in existence. The LDB holds business establishment data from 1990 onward and contains over 400 million records. It serves as the sample frame for the Bureau's establishment-based surveys, as the source for the published Business Employment Dynamics data, and as a research database for economists. The relevance and usefulness of the data therefore rely primarily on the timeliness and accuracy with which they are collected, cleaned, stored, and reported. The data are collected by each state from its Unemployment Insurance system and are reviewed through a rigorous process. One technique used is screening each record through a series of conditional edits based on deviations from prior values of the variable of interest.
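
As a rough illustration of what a conditional edit based on deviation from a prior value might look like, the Python sketch below flags a record only when both the absolute and the relative change from the prior value exceed a limit. The rule names, the threshold values, and the two-condition form are assumptions chosen for illustration; they are not the edits actually applied to the LDB.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EditRule:
    """One hypothetical conditional edit: both limits must be exceeded to fail."""
    name: str
    min_abs_change: float  # illustrative absolute-change limit
    min_rel_change: float  # illustrative relative-change limit (0.5 means 50%)


def failed_edits(current: float, prior: Optional[float],
                 rules: List[EditRule]) -> List[str]:
    """Return the names of the edits that the current value fails."""
    if prior is None:
        # No prior value on file, so deviation-based edits cannot be applied.
        return []
    abs_change = abs(current - prior)
    # A prior value of zero makes the relative change undefined; treat it as infinite.
    rel_change = abs_change / abs(prior) if prior != 0 else float("inf")
    return [
        rule.name
        for rule in rules
        if abs_change > rule.min_abs_change and rel_change > rule.min_rel_change
    ]


# Hypothetical thresholds, not taken from the source.
RULES = [
    EditRule("large_change", min_abs_change=100, min_rel_change=0.5),
    EditRule("extreme_change", min_abs_change=1000, min_rel_change=2.0),
]

if __name__ == "__main__":
    # Employment moves from 100 to 400 between quarters.
    print(failed_edits(400, 100, RULES))
```

Requiring both conditions keeps small establishments from being flagged for changes that are large in percentage terms but trivial in level, and keeps large establishments from being flagged for ordinary fluctuations.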