2.1. Innovation in general at CBS
Statistics Netherlands supports decision-making by providing the public and private sectors with reliable, transparent and coherent statistics of undisputed quality. These statistics are also used in scientific research. The official statistics published by Statistics Netherlands cover topics that are relevant to society and government, such as economic activity, consumer confidence, safety, health and leisure. Within the Dutch public sector, Statistics Netherlands is the data expert, having amassed 120 years of experience. Its employees have broad expertise in data sources, ranging from survey and administrative data to alternative (big) data sources, with which they have worked for quite some time. Individual big data sources usually tell only part of the story, are often owned by private parties, and vary widely in quality. There is a need for an independent and trustworthy party that combines these big data sources with other sources such as survey and administrative data, sets quality standards and adheres to transparency, while safeguarding the privacy of individual citizens and companies at all times. To create and maintain this trust as a national statistical institute (NSI), simply complying with privacy regulations is not enough. Compliance must be accompanied by open and transparent communication and a dialogue with privacy organizations and society as a whole.
Statistics Netherlands carries out a wide range of innovation activities at the input, throughput and output levels: improving (big) data access with data holders, automating and digitalizing processing and tooling, building new data infrastructure, developing a data strategy, adopting an agile way of working, and improving access to systems, all to make processes more efficient and output more relevant for policy makers. Statistics Netherlands has also set up an Observation Innovation Network, where experiments are carried out with apps and with sensor data coupled to surveys (smart surveys), alongside several initiatives to make maximal use of administrative data sources.
2.2. Innovation and Big Data at CBS
The launch of the Centre for Big Data Statistics (CBDS) in September 2016 was one of Statistics Netherlands’ innovation initiatives to support evidence-based policy with new, detailed or real-time information. The CBDS offers opportunities for developing data science methods and techniques through partnerships with the academic world and through the exploitation of new data sources (data scouting). Statistics Netherlands and its partners bring together knowledge, infrastructure and data in order to meet the current information needs of society. The CBDS works on socially relevant themes such as economic growth, the energy transition, mobility, the labour market, health, the housing market, safety and cross-border statistics. The innovation consists of developing experimental statistics through product development: identifying data sources and new methods and techniques to improve existing official statistics, or to develop new statistics, so that policy questions can be addressed in a more timely or detailed manner. These activities lead to beta products and working papers on the methodologies used, which are published on the innovation website.
One example of a new experimental statistic is an improved model to determine the solar energy yield from photovoltaic (PV) systems at the regional (municipality) level and on a daily basis (Laevens et al., 2020). Currently, Statistics Netherlands produces yearly national estimates using a register containing most PV systems in the Netherlands. A growing need for higher-resolution data led to an improved method based on newly identified alternative data sources: high-resolution solar irradiance data derived from satellite images, and yield data from PV systems available on an online portal. Combining these data in a new model led to new insights into the production of solar energy at the local level. This helps local authorities better understand how much energy is generated in their municipality.
Another example is the detection of small innovative companies using text data from their websites (Daas and van der Doef, 2020). Statistics Netherlands sends out surveys to collect information on innovation in companies, but these surveys do not cover the smaller innovative companies. With this new approach, Statistics Netherlands and other NSIs have been able to detect innovation in smaller companies and startups.
2.3. The innovation process
The aim was to implement these experimental statistics in the official statistical output so that policy makers could make use of the validated information. As a rule, we start new innovation projects with the development of a Proof of Concept (POC) to demonstrate the capabilities of a new method or data source. Successful POCs can be further developed into experimental statistics called beta products. Beta product development looks at the stability of the data source, validates the method, and tests the requirements for further implementation. An innovation is complete when an experimental statistic has been converted into a full-fledged one-time publication or an official statistic. However, many barriers have been encountered that made implementation rather challenging, ranging from methodological to technical and cultural challenges. An overview of these can be found in De Broe et al. (2021a). As a result, Statistics Netherlands has so far not been able to implement these new outputs, with the exception of two already existing statistics (traffic intensity using traffic loop data and the Consumer Price Index using scanner data). To enable implementation, the research and development division has designed an innovation pipeline model that facilitates and coordinates the process from Proof of Concept to beta publication to official statistic. A short description of the innovation pipeline follows.
An innovation begins in the idea phase. Ideas may be intended to replace regular processes in the long run, or to create new products or statistics. Important criteria for deciding whether an exploration is worthwhile are whether there are sponsors (although this is not a precondition), whether the idea fits within the Statistics Netherlands Act (Article 3), whether adequate staffing is available from within the different statistical divisions, and whether the new output addresses an information need among users and policy makers. In the exploration phase, the possibilities of obtaining grants for the idea (POC) and the possibilities for new output with the data and/or the methods are investigated. The end result of this phase should be a working prototype and an insight into what still needs to be done to bring the product to a final result. At the end of the exploration phase, it is determined whether these preconditions have been met and whether there is sufficient reason to proceed to the product development phase.
In the product development phase, the prototype is further developed, methodological issues are addressed, the quality and stability of the data are examined, and the results are validated. At the end of this phase there should not be any fundamental issues that stand in the way of implementation. An important part of the subsequent implementation phase is to ensure that employees are adequately trained and that the IT infrastructure is in place to handle the production of the (new) output. To emphasize the importance of innovation for Statistics Netherlands, the CBDS was positioned as an incubator for the exploration and product development phases.
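The phases and decision points described above can be read as a simple stage-gate process. The following Python sketch is only an illustration of that reading, not a tool used by Statistics Netherlands: the class names, the `GATES` checklists and the criterion labels are hypothetical and paraphrase the criteria mentioned in the text. Sponsorship is left out of the first gate because the text states it is not a precondition.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    IDEA = 1
    EXPLORATION = 2
    PRODUCT_DEVELOPMENT = 3
    IMPLEMENTATION = 4


# Hypothetical checklists: criteria that must be met before leaving each phase.
# The labels paraphrase the text; they are not an official list.
GATES = {
    Phase.IDEA: {
        "fits_statistics_netherlands_act",
        "adequate_staffing",
        "addresses_information_need",
    },
    Phase.EXPLORATION: {"working_prototype", "remaining_work_known"},
    Phase.PRODUCT_DEVELOPMENT: {"results_validated", "no_fundamental_issues"},
}


@dataclass
class Innovation:
    name: str
    phase: Phase = Phase.IDEA
    met: set = field(default_factory=set)

    def missing(self):
        """Return the criteria of the current gate that are not yet met."""
        return GATES.get(self.phase, set()) - self.met

    def advance(self):
        """Move to the next phase if the current gate is passed."""
        if self.phase is Phase.IMPLEMENTATION or self.missing():
            return False
        self.phase = Phase(self.phase.value + 1)
        return True


if __name__ == "__main__":
    project = Innovation("example innovation project")
    print(project.advance(), sorted(project.missing()))
    project.met |= GATES[Phase.IDEA]
    print(project.advance(), project.phase.name)
```

Making the gate criteria explicit in this way mirrors the point that each phase ends with a decision on whether there is sufficient reason to proceed.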
An important aspect throughout the innovation process is transparency: at any time it must be clear which users could benefit from the planned innovation. Benefits may include new information for policy makers and the population, but also efficiency gains for NSIs and Eurostat, and a lower burden for data suppliers and respondents through shorter or abolished questionnaires. Especially because innovations may also fail, it is important to make the hypothetical value and the business case explicit. This requires continuous contact between the researchers and the beneficiaries involved, as well as an assessment of the viability of the innovation at regular intervals.