Small area estimation with zero-inflated data - a simulation study

Many target variables in official statistics follow a semicontinuous distribution: a mixture of zeros and continuously distributed positive values. Such variables are also called zero-inflated. When reliable estimates are required for subpopulations with small samples, model-based small area estimation can be used, which improves the accuracy of the estimates by borrowing information from other domains. In this paper, two small area estimators are compared in a simulation study with zero-inflated target variables. The first estimator, the EBLUP (empirical best linear unbiased predictor), can be regarded as the standard small area estimator and is based on a linear model that assumes normal distributions. Its model is therefore misspecified for zero-inflated data. The second estimator is based on a model that takes the zero-inflation into account and is therefore less misspecified. Both estimators are found to improve accuracy compared to a design-based approach. The gain in accuracy is generally larger for the model that takes the zero-inflation into account. The size of the improvement depends on properties of the population and differs considerably between domains.
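
As an illustration of what a zero-inflated (semicontinuous) variable looks like, the following minimal sketch draws such a variable and decomposes its mean into the two parts that a zero-inflation model treats separately: the probability of a positive value and the mean of the positive values. It is not the simulation design or either estimator of this study. The lognormal distribution for the positive part, the parameter values, and the function names `simulate_zero_inflated` and `two_part_mean` are all assumptions made for the example.

```python
import numpy as np


def simulate_zero_inflated(n, p_positive, mu_log, sigma_log, rng):
    """Draw n values that are zero with probability 1 - p_positive
    and lognormal otherwise (the lognormal choice is an assumption)."""
    is_positive = rng.random(n) < p_positive
    positive_part = rng.lognormal(mean=mu_log, sigma=sigma_log, size=n)
    return np.where(is_positive, positive_part, 0.0)


def two_part_mean(y):
    """Mean written as P(y > 0) * E[y | y > 0]."""
    y = np.asarray(y, dtype=float)
    positive = y > 0
    if not positive.any():
        return 0.0
    return positive.mean() * y[positive].mean()


# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(0)
y = simulate_zero_inflated(n=1000, p_positive=0.3, mu_log=1.0,
                           sigma_log=0.5, rng=rng)

print("share of zeros:", np.mean(y == 0))
print("sample mean:   ", y.mean())
print("two-part mean: ", two_part_mean(y))
```

Without a model, the two-part product is algebraically identical to the ordinary sample mean. A model that takes the zero-inflation into account differs because each part is modelled separately, which is what allows information to be borrowed across domains for both parts.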