Data scouting in coronavirus times

Author: Editorial staff
Florian Henning of CBS meeting with Michel van Bommel of the Dutch Payments Association via Zoom
© Sjoerd van der Hucht
The data landscape is changing. Data are being collected and stored in more and more places, such as traffic loops, retail scanners and payment systems. That means more data are available, in some cases faster than was previously possible, increasing the potential to improve existing statistics and create new statistics. CBS makes maximum use of these secondary data sources. They also help in the compilation of high-quality data during the coronavirus crisis, which is important in maintaining up-to-date knowledge. The Dutch Payments Association is one of the cooperation partners that supply such data to CBS.

Data Officer

CBS is engaged in various initiatives to gain access to secondary data sources and use them to produce statistics. These data scouting activities not only increase the potential and quality of the statistics, but also deliver on the Van Straalen Committee’s recommendation to ease the burden placed on companies by the government. Florian Henning, Data Officer at CBS, coordinates these activities. ‘Using secondary data sources involves a lot of work on legal, technical and methodological aspects. When is it possible to use external data, and when is it not? How can privacy be guaranteed? How do we integrate these data into our IT systems? What methodological approach is required to produce high-quality figures with these data? These questions are relevant not only within CBS but also in managing relationships with our cooperation partners. As CBS is making increasing use of data scouting, a single central knowledge and coordination point has been set up.’

Social importance

External source data owners increasingly appreciate the social importance of sharing data with CBS, Henning explains. ‘Source data owners work with us because they want to contribute to society and run their business responsibly. The cooperation also often leads to CBS investigating new methodologies, and that in turn helps source data owners to develop their knowledge. In some cases they also work directly with CBS in consortia on innovation projects.’ CBS is making extensive use of secondary data sources during the present coronavirus crisis. ‘Many coronavirus-related figures are based on a combination of CBS data and external data sources, for example from administrative processes,’ Henning says. ‘That means new figures can be produced rapidly to support policymaking. For example, local authorities send us data on applications for financial support under the Temporary Bridging Measure for Self-employed Professionals (Tozo). And various operators such as Schiphol Airport and ProRail are supplying weekly data for fast indicators of transport movements and goods shipments during the coronavirus period. Consumption figures can now also be produced rapidly using scanner data from supermarkets, DIY stores and other retail outlets. They can be used, for example, to investigate the phenomenon of hoarding.’

CBS can use secondary data sources as a rapid means of producing high-quality, detailed, up-to-date statistics. The government uses these statistics, for example, as a basis for policymaking. CBS has substantially eased the burden on business in the past by making increasing use of existing data sources. That also means fewer questionnaires have had to be completed by individuals and households. CBS has therefore been given legal authorisation to collect and process these data for the production of statistics.

Privacy is guaranteed throughout this process. We only publish statistical information if natural persons and companies are not recognisable or traceable. We also have measures in place to prevent the theft, loss or misuse of personal data. CBS never supplies recognisable data to third parties, not even to other government bodies.

Dutch Payments Association

The Dutch Payments Association, the industry association for regulated payment services in the Netherlands, has been supplying weekly data to CBS since the start of the coronavirus crisis. The data cover numbers or amounts of transactions at point-of-sale terminals, online payments using iDEAL and cash withdrawals from cash machines. These data are aggregated and anonymised. CBS can use them as a rapid means of providing figures on consumption and expenditure. Michel van Bommel, Senior Policy Advisor Strategy & External Affairs at the Dutch Payments Association, said: ‘Since 2018 we and CBS have been discussing ways of assisting each other with the collection and reporting of data on domestic and cross-border payments. CBS, the Dutch Banking Association and the Dutch Payments Association are looking for good data on business payments between the Netherlands and other countries, including those outside the EU. For example, these could include analyses of interregional transactions by sector, such as trade in bulbs with German companies. The cooperation with CBS has already resulted in the Dutch Payments Association supplying weekly figures to CBS on payments during the coronavirus crisis. Conversely, we supplement our own figures with insights from CBS on matters such as the use of the Internet and online banking and payment services, as well as visits to online retailers.’

Supplementary data

CBS uses secondary data for various purposes. ‘The volume of data collected opens up many possibilities,’ says Henning. ‘Secondary data can increasingly be used in the production of statistics. They are obtained from registers as well as other sources. We use scanner data to record consumer prices and information from traffic loops to record traffic volume. CBS is constantly investigating new possibilities, such as the use of image recognition to map solar panels or the use of floating car data based on GPS.’ Do these developments spell the end of traditional data collection methods? ‘No,’ says Henning, ‘primary observation is still very important. Questionnaires will always be required in order to collect certain data. Secondary sources are complementary and can be used to broaden, accelerate and improve the quality of statistics, and as a means of validating previously used sources or adding more detail, for example with geographic or time data. But each type of data makes its own contribution: sometimes secondary data are sufficient, whereas in other cases a complete picture can only be obtained with primary data, so both sources are required.’

Data scouting community

Henning acts as a facilitator and advisor on data scouting projects, working together with a multidisciplinary data scouting community. ‘The community includes people with a range of specialisations, such as lawyers, account managers, statistical researchers, project managers, technical experts and methodologists, all of whom have extensive experience of data scouting. Its focus is on further harmonisation of the data scouting process. Data scouting is new, so we’re continuing to develop our knowledge of it. To provide optimum support for new projects, we’ve drawn up process descriptions, manuals and recommendations. But data scouting is closely bound up with relationship management, so it often requires a customised approach rather than a rigid process. Our role is to provide support and coordination rather than to make policy. We’ve compiled a catalogue listing all the secondary data sources and suppliers we work with, and a data scouting monitor, which lists all ongoing projects. That means we can maintain a comprehensive picture and assess our data gap, in other words what data we have or don’t have, and what the priority is for further cooperation with source data owners.’