Chart 1. Distribution of cosine similarity scores for selected Quarterly Census of Employment and Wages matches, weighted by agency size
1 – cosine similarity    Cumulative distribution
0.01187                  0.00013
0.03184                  0.00021
0.05180                  0.00021
0.07176                  0.00022
0.09172                  0.00022
0.11169                  0.00022
0.13165                  0.00022
0.15161                  0.00022
0.17157                  0.00022
0.19153                  0.00022
0.21150                  0.00022
0.23146                  0.00022
0.25142                  0.00046
0.27138                  0.00154
0.29134                  0.00703
0.31131                  0.01406
0.33127                  0.01741
0.35123                  0.02559
0.37119                  0.02874
0.39116                  0.03134
0.41112                  0.04233
0.43108                  0.05526
0.45104                  0.05691
0.47100                  0.05904
0.49097                  0.10348
0.51093                  0.11603
0.53089                  0.13312
0.55085                  0.16164
0.57081                  0.16786
0.59078                  0.17595
0.61074                  0.23824
0.63070                  0.24757
0.65066                  0.25997
0.67062                  0.26719
0.69059                  0.28648
0.71055                  0.33554
0.73051                  0.34143
0.75047                  0.39633
0.77044                  0.40433
0.79040                  0.43222
0.81036                  0.44594
0.83032                  0.78762
0.85028                  0.80317
0.87025                  0.81134
0.89021                  0.86091
0.91017                  0.87446
0.93013                  0.93026
0.95009                  0.94315
0.97006                  0.94470
0.99002                  1.00000

Source: Author’s calculations.