The variance estimation methodology for the Survey of Occupational Injuries and Illnesses (SOII) was updated with the release of reference year 2020 estimates. The update accommodates survey weighting that reflects changes in establishment employment size and the increased nonresponse encountered in the current survey environment. To evaluate the impact of the update on the published SOII measures of dispersion, SOII 2019 relative standard errors (RSEs) and publishable injury and illness case counts were compared with estimates created using the updated methodology. Although there were minor differences in RSEs and in the publishability of some case counts, the update had limited impact on SOII estimates.
The application of the new variance formula does not represent a break in series. The updated estimation methodology is as follows:
The SOII sample design uses stratified random sampling to select establishments. The sampling strata are formed by state, ownership, size, and targeted estimation industry (TEI). Within each stratum, a simple random sample is drawn. An unbiased variance estimate for a weighted random sample within a SOII stratum is calculated as follows [1]:
In the context of survey sample variance estimation, the finite population correction (FPC) factor is often incorporated into variance estimation, so that the estimated variance is only applied to the portion of the population that is not in the sample. Thus, the variance for the weighted sample mean becomes
The FPC factor is 1 − ƒ, where ƒ = n⁄N is the sampling fraction: the number of sampled units n divided by the number of population units N [2].
The variance for the weighted sample total (ŷ = Nȳ*) is N² times the variance of the weighted sample mean: var(ŷ) = N² var(ȳ*).
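As a rough illustration of how the FPC and the mean-to-total scaling interact, the sketch below uses the textbook unweighted simple-random-sample form, (1 − ƒ)s²⁄n for the mean and N² times that for the total. This is an assumption made for illustration only: it is not the SOII weighted formula, and the function names and sample values are hypothetical.

```python
from statistics import variance


def srs_mean_variance(values, population_size):
    """Variance of the sample mean for an unweighted simple random sample,
    with the finite population correction applied: (1 - f) * s^2 / n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two sampled units to estimate variance")
    if population_size < n:
        raise ValueError("population size cannot be smaller than the sample size")
    f = n / population_size      # sampling fraction f = n / N
    s2 = variance(values)        # sample variance with an (n - 1) denominator
    return (1 - f) * s2 / n


def srs_total_variance(values, population_size):
    """Variance of the estimated total N * mean, which is N^2 * var(mean)."""
    return population_size ** 2 * srs_mean_variance(values, population_size)


if __name__ == "__main__":
    sample = [2, 4, 4, 6]        # hypothetical case counts from 4 sampled units
    N = 20                       # hypothetical stratum population size
    print(srs_mean_variance(sample, N))
    print(srs_total_variance(sample, N))
```

When the sample covers the whole stratum (n = N), ƒ equals 1 and both variances are zero, which matches the idea that the variance applies only to the unsampled portion of the population.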
The variance of a rate or ratio of two survey variables (x, y), x̂⁄ŷ, is approximated by the Taylor series linearization method as follows [3]:
var(x̂⁄ŷ) ≈ (1⁄ŷ²) [var(x̂) − 2(x̂⁄ŷ) cov(x̂,ŷ) + (x̂⁄ŷ)² var(ŷ)]
where
x̂ is the total weighted estimate of the variable (case counts for each case type):
ŷ is the total weighted estimate of the variable (hours):
var(x̂) is the variance estimate of x̂ (total case counts for each case type)
var(ŷ) is the variance estimate of ŷ (total hours)
cov(x̂,ŷ) is the covariance estimate between x̂ and ŷ (case counts for each case type and hours)
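The quantities defined above can be combined in a few lines of code. The sketch below assumes the standard first-order linearization form, var(x̂⁄ŷ) ≈ [var(x̂) − 2R·cov(x̂,ŷ) + R²·var(ŷ)]⁄ŷ² with R = x̂⁄ŷ, and expresses the RSE as a percentage of the estimate. The function names and input values are hypothetical, and the percentage convention for the RSE is an assumption rather than something stated in this document.

```python
import math


def ratio_variance(x_hat, y_hat, var_x, var_y, cov_xy):
    """First-order Taylor series (linearization) approximation of var(x_hat / y_hat)."""
    if y_hat == 0:
        raise ValueError("the denominator total y_hat must be nonzero")
    r = x_hat / y_hat  # estimated ratio R
    return (var_x - 2 * r * cov_xy + r ** 2 * var_y) / y_hat ** 2


def relative_standard_error(estimate, variance):
    """Standard error as a percentage of the estimate (assumed RSE convention)."""
    if estimate == 0:
        raise ValueError("RSE is undefined for a zero estimate")
    if variance < 0:
        raise ValueError("variance must be non-negative")
    return 100 * math.sqrt(variance) / abs(estimate)


if __name__ == "__main__":
    # Hypothetical weighted totals and their estimated (co)variances.
    x_hat, y_hat = 50.0, 1000.0
    var_x, var_y, cov_xy = 25.0, 400.0, 20.0

    v = ratio_variance(x_hat, y_hat, var_x, var_y, cov_xy)
    print(v)
    print(relative_standard_error(x_hat / y_hat, v))
```

A positive covariance between case counts and hours reduces the variance of the ratio, which is why the covariance term is needed in addition to the two variances.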
2. Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Wiley. p. 24.
3. Lohr, S. L. (2010). Sampling: Design and Analysis (2nd ed.). Brooks/Cole, Cengage Learning.
Last Modified Date: December 16, 2021