Calibrated hot-deck donor imputation subject to edit restrictions

12-8-2010 15:00

A major problem faced by virtually all institutes that collect statistical data on persons or enterprises is that values may be missing from the observed data sets. The most common way to handle missing data is imputation. At national statistical institutes and other statistical organisations, imputation is further complicated by constraints in the form of edit restrictions that the data must satisfy. Examples of such edit restrictions are that, in the Netherlands, someone younger than 16 cannot be married, and that someone whose marital status is unmarried cannot be the spouse of the head of household. Records that do not satisfy these edits are inconsistent and are therefore considered incorrect. A further complication for categorical data is that the frequencies of certain categories are sometimes known from other sources or have already been estimated. In this paper we develop imputation methods for categorical data that take these edits and known frequencies into account while imputing a record.
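
To make the two example edits concrete, the following is a minimal sketch in Python. It is not the method developed in the paper: the field names (`age`, `marital_status`, `relation_to_head`), the category labels, and the helper functions `satisfies_edits` and `impute_from_donor` are all hypothetical, and the donor is simply the first one that yields a consistent record. Calibration to known category frequencies is deliberately left out.

```python
def satisfies_edits(record):
    """Return True if the record passes both example edits from the abstract."""
    age = record["age"]
    marital = record["marital_status"]
    relation = record["relation_to_head"]
    # Edit 1: someone younger than 16 cannot be married.
    if age is not None and age < 16 and marital == "married":
        return False
    # Edit 2: an unmarried person cannot be the spouse of the head of household.
    if marital == "unmarried" and relation == "spouse":
        return False
    return True


def impute_from_donor(recipient, donors):
    """Fill missing (None) fields from the first donor giving a consistent record.

    Assumption: donors are complete records; returns None if no donor fits.
    """
    for donor in donors:
        # Keep the recipient's observed values; copy only the missing ones.
        candidate = {
            field: (donor[field] if value is None else value)
            for field, value in recipient.items()
        }
        if satisfies_edits(candidate):
            return candidate
    return None


# Hypothetical data: a 14-year-old with two missing categorical values.
donors = [
    {"age": 40, "marital_status": "married", "relation_to_head": "spouse"},
    {"age": 12, "marital_status": "unmarried", "relation_to_head": "child"},
]
recipient = {"age": 14, "marital_status": None, "relation_to_head": None}
imputed = impute_from_donor(recipient, donors)
```

In this sketch the first donor is rejected because copying its values would produce a married 14-year-old, which violates the first edit; the second donor yields a consistent record and is used instead.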

Downloads

PDF - 2010-16-x10-pub