Estimating Classification Error under Edit Restrictions in Combined Survey-Register Data

22/09/2016 10:15

This discussion paper describes a method based on latent class modelling that estimates the number of classification errors in multiple sources while simultaneously taking impossible combinations with other variables into account.

Both registers and surveys can contain classification errors. These errors can be estimated using the information that becomes available when the sources are combined into one dataset. We propose a new method based on latent class (LC) modelling that estimates the number of classification errors in multiple sources while simultaneously taking impossible combinations with other variables into account.
We also use the LC model to multiply impute a new variable, which enhances the quality of statistics based on the combined dataset. A simulation study of the method's performance shows that its applicability depends on the entropy R² of the LC model and on the type of analysis the researcher plans to carry out. Finally, the method is applied to a combined dataset from Statistics Netherlands.
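
Since the abstract ties applicability to the entropy R² of the LC model, a minimal sketch of how that quantity can be computed may help. The formula is not given on this page; the sketch assumes one common definition, namely one minus the ratio of the average entropy of the posterior class-membership probabilities to the entropy of the estimated class proportions. The function names and the example posteriors are hypothetical and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector; 0 * log(0) is taken as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_r2(posteriors):
    """Entropy R^2 under the assumed definition:
    1 - (mean entropy of posterior memberships) / (entropy of class proportions).

    `posteriors` holds one row per unit; each row contains that unit's
    posterior probabilities of belonging to each latent class.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    # Estimated class proportions: average posterior probability per class.
    proportions = [sum(row[c] for row in posteriors) / n for c in range(k)]
    total_entropy = entropy(proportions)
    if total_entropy == 0:
        raise ValueError("all mass is in one class; entropy R^2 is undefined")
    conditional_entropy = sum(entropy(row) for row in posteriors) / n
    return 1.0 - conditional_entropy / total_entropy


# Hypothetical posteriors for illustration only.
sharp = [[1.0, 0.0], [0.0, 1.0]]   # indicators identify the class perfectly
vague = [[0.5, 0.5], [0.5, 0.5]]   # indicators carry no information
mixed = [[0.9, 0.1], [0.1, 0.9]]   # some remaining uncertainty

print(entropy_r2(sharp), entropy_r2(vague), entropy_r2(mixed))
```

Under this definition, values close to 1 mean that the observed indicators predict the latent "true" class almost without uncertainty, while values close to 0 mean that they add little beyond the class proportions.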
