Is it MAR or NMAR?

M. Sverchkov

Abstract

Most methods for estimating response probabilities assume, either explicitly or implicitly, that the missing data are ‘missing at random’ (MAR). In many practical situations, however, this assumption is not valid, since the probability of responding often depends on the outcome value or on latent variables related to the outcome. The case where the missing data are not MAR (NMAR) can be treated by postulating a parametric model for the distribution of the outcomes under full response and a model for the response probabilities. Together, the two models define a parametric model for the joint distribution of the outcome and the response indicator, so the parameters of this model can be estimated by maximizing the corresponding likelihood. Modeling the distribution of the outcomes under full response can be problematic, however, since no data are available from this distribution. Sverchkov (2008) proposed a new approach that permits estimating the parameters of the model for the response probabilities without modeling the distribution of the outcomes under full response. The approach utilizes relationships between the population, the sample and the sample-complement distributions derived in Pfeffermann and Sverchkov (1999) and Sverchkov and Pfeffermann (2004). The present paper investigates how this approach can be used for testing whether response is MAR or NMAR.
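
As a rough illustration of the full-likelihood treatment described above, the joint model can be sketched as follows. The notation is an assumption introduced here, not taken from the paper: y_i is the outcome, x_i are covariates assumed known for respondents and nonrespondents alike, R_i is the response indicator, f(y | x; \theta) is the postulated outcome model under full response, and p(y, x; \gamma) = \Pr(R = 1 | y, x; \gamma) is the response model. Units are assumed independent.

```latex
% Full likelihood of the joint model for (y_i, R_i): respondents contribute
% the joint density, nonrespondents the marginal probability of nonresponse.
L(\theta, \gamma)
  = \prod_{i:\, R_i = 1} p(y_i, x_i; \gamma)\, f(y_i \mid x_i; \theta)
    \prod_{i:\, R_i = 0} \left[ 1 - \int p(y, x_i; \gamma)\, f(y \mid x_i; \theta)\, dy \right]

% Outcome distribution among respondents, by Bayes' rule:
f(y \mid x, R = 1)
  = \frac{p(y, x; \gamma)\, f(y \mid x; \theta)}
         {\int p(y', x; \gamma)\, f(y' \mid x; \theta)\, dy'}
```

In this notation, response is MAR when p(y, x; \gamma) does not depend on y. The likelihood then factorizes into a part involving only \gamma and a part involving only \theta, and the respondents' outcome distribution coincides with f(y | x; \theta). Under NMAR the two distributions differ, which is why the outcome model under full response cannot be checked directly against the observed data.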