European study: uniform data validation rules

08/02/2018 15:50 / Author: Masja de Ree / Photography: Miriam van der Sangen / Category: International developments
It has been nicknamed ‘data pingpong’: the back-and-forth exchange of data among organisations, e.g. between the EU’s national statistical institutes and the EU statistical agency Eurostat. The ESSnet on data validation is a European research programme aimed at improving data exchange by agreeing on a uniform set of validation rules beforehand. This topic featured prominently at a conference hosted by Statistics Netherlands (CBS) in The Hague on 11 and 12 January, entitled ‘ESSnet: Validat Integration’.

Data validation

Eurostat produces European statistics based on data provided by the individual member states. ‘All these countries have different data collection methods,’ says Olav ten Bosch (CBS). ‘Differences in definitions and interpretation may result in erroneous data. This means Eurostat cannot simply combine and use all the data it receives.’ Validating the datasets by adjusting for such differences is very time-consuming when done manually. That is why European countries have joined forces to arrive at a uniform set of statistical validation rules, to be applied in each individual country before it submits its data to Eurostat. If this is successful, possible errors in the data will be detected and corrected at an earlier stage.
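
To illustrate what applying a shared set of validation rules before submission could look like, here is a minimal sketch in Python. It is not the ESSnet rule language or any Eurostat tool; the rule names, the record fields (`country`, `value`, `year`) and the `validate` function are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical rules agreed on beforehand; names and fields are assumptions.
RULES: Dict[str, Rule] = {
    "country_code_is_two_letters": lambda r: (
        isinstance(r.get("country"), str)
        and len(r["country"]) == 2
        and r["country"].isalpha()
    ),
    "value_is_non_negative": lambda r: (
        isinstance(r.get("value"), (int, float)) and r["value"] >= 0
    ),
    "year_is_plausible": lambda r: (
        isinstance(r.get("year"), int) and 1950 <= r["year"] <= 2100
    ),
}


def validate(records: List[Record], rules: Dict[str, Rule] = RULES) -> List[Tuple[int, str]]:
    """Return (record index, rule name) for every rule a record fails."""
    failures = []
    for index, record in enumerate(records):
        for name, rule in rules.items():
            if not rule(record):
                failures.append((index, name))
    return failures


if __name__ == "__main__":
    sample = [
        {"country": "NL", "value": 10.5, "year": 2017},
        {"country": "NLD", "value": -1, "year": 2017},
    ]
    # Each country would run the same rules locally before submitting.
    for index, name in validate(sample):
        print(f"record {index} fails rule {name}")
```

Because every participant would run the same rule set, a record that fails locally would fail in the same way at the receiving end, which is the point of agreeing on the rules in advance.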

Testing scenarios

Six countries in all take part in this research programme: Germany, Lithuania, Poland, Sweden, Portugal and the Netherlands. Ten Bosch: ‘We build on the results of the first ESSnet in 2015, when we decided on the methodology and reviewed the language in which to draw up the rules. This time we are taking it one step further: we are testing scenarios in different countries in which we apply the new methodology and rules.’ One successful element in the research programme is the generic validation report developed by CBS: ‘We would like to see a generic format for all reporting of data assessments. The good thing about this report is that it is machine readable. It allows computers to process the results contained in the report immediately and automatically push them towards data adjustment procedures. Both the Netherlands and Poland are already using the report; the further rollout is in the hands of Eurostat.’
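
The idea of a machine-readable validation report can be sketched as follows. This is not the CBS report format, which the article does not specify; the JSON layout, the field names (`dataset`, `status`, `failures`, `record`, `rule`) and both functions are assumptions made purely for illustration.

```python
import json
from typing import Dict, List


def build_report(dataset_id: str, failures: List[Dict[str, object]]) -> str:
    """Serialise validation results as JSON (hypothetical layout)."""
    report = {
        "dataset": dataset_id,
        "status": "fail" if failures else "pass",
        "failures": failures,
    }
    return json.dumps(report, sort_keys=True)


def route_failures(report_json: str) -> Dict[str, List[int]]:
    """Group failing record numbers by rule, ready for an adjustment step."""
    report = json.loads(report_json)
    by_rule: Dict[str, List[int]] = {}
    for failure in report["failures"]:
        by_rule.setdefault(failure["rule"], []).append(failure["record"])
    return by_rule


if __name__ == "__main__":
    text = build_report(
        "demo-dataset",
        [
            {"record": 1, "rule": "value_is_non_negative"},
            {"record": 3, "rule": "value_is_non_negative"},
        ],
    )
    # A downstream program can read the report without human intervention.
    print(route_failures(text))
```

Because the report is structured data rather than free text, a downstream adjustment procedure can pick up the failing records per rule without anyone having to read the report first.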

Regional conferences

The ESSnet is organising four conferences, each in a different location. The second conference took place on 11 and 12 January at CBS in The Hague. Ten Bosch: ‘At each conference, we present the results of our research work. For the participants, it is also an opportunity to discuss application of the results in their own region.’ The conference in The Hague was attended by 30 participants from 15 countries. Ten Bosch: ‘It produced a wide range of new ideas, for instance on ways to review the validation rules over time. It is important to realise that for one system to be implemented throughout the whole community, you need to know how things are done in each individual country. This conference contributed to mutual understanding in this respect.’ The ESSnet on data validation will end on 1 March. Ten Bosch: ‘We have taken another step forward towards implementation of a uniform set of rules. The building blocks for this are being created step by step and implemented within each individual statistical domain. In the National Accounts, for example, we have come a long way.’

Ambitious project plan

Leading the ESSnet project on data validation is Volker Weichert of Germany’s federal statistical office Destatis. He explains the way countries are collaborating in this project. ‘This research project has been built upon the joint capabilities of all participating countries. All those involved contributed essential skills and resources. We have a very ambitious project plan. The specialists from the six participating countries and from Eurostat not only shared their own professional expertise, but also their own unique perspectives based on personal experiences with statistical production in their own country. We can see the dedication of all those involved in the project in the results and in the 1,500 email messages which I received over the past twelve months!’ What will be the next step? ‘The European Commission has awarded subsidies to a number of countries for implementation of the new validation rules in their production process. At the same time, Eurostat is working on completion of several services which are going to facilitate the introduction of these validation rules. The infrastructure needed to implement the validation rules in all EU member countries is becoming more and more solid.’

Cost-benefit model

Sónia Quaresma of the Portuguese statistical institute helped organise the conference in The Hague. She looks back with satisfaction: ‘I feel that we have achieved a great deal and that we were successful in establishing a proper framework for statistical validation.’ As part of the ESSnet, Quaresma has drafted a cost-benefit model that enables comparison of the different scenarios for implementation of the new rules. Quaresma: ‘We have designed various tools and strategies. The NSIs [national statistical institutes] should have a proper description and assessment of the various options in order to make an informed choice. This cost-benefit analysis helps you determine the right choice given your own situation. The scenarios for data validation will need to develop organically over the next few years as more and more NSIs start their own implementations.’