Manual retail price observations discontinued

Author: Miriam van der Sangen/Masja de Ree
Cash register at an Albert Heijn supermarket in Zoetermeer
© Hollandse Hoogte / Frank de Roo
Over the past century, around one hundred interviewers would head out each month on behalf of Statistics Netherlands (CBS) to record the prices of products at 7,000 different shops around the country. These visits have been phased out in recent years. As of 1 January 2020, manual retail price observations are a thing of the past. CBS now collects price data using scanner data and web scrapers instead. This makes CBS the first statistical agency in the euro area to reach this milestone.

Every month, CBS collects information on the prices of a large number of products and services. These data are used to calculate the consumer price index (CPI), an important indicator of consumer price inflation. For many years, the source data were collected from shops around the country by CBS interviewers. They would enter the shops with a ‘shopping list’ of representative products and record the prices there and then. This labour-intensive process has been gradually scaled back over the past few years. Jos van Linden is familiar with the ins and outs of this process. She has worked as an interviewer for CBS for the past 20 years and visited a number of stores each month. ‘I’ve always enjoyed this work. I visited some of the stores regularly for ten years on end. It allowed me to build a close relationship with the shopkeepers.’

Pleasant job

What did the interviewer’s job entail? ‘I’d report to the shopkeeper and then make my round through the shop. Usually, I would record the prices of about 20 to 25 different products. We always tried to make sure our work caused shopkeepers as little disruption as possible. For example, customers would always go first and we’d pick the most convenient times to visit shops, for instance on a Wednesday or Thursday morning.’ Van Linden did notice how her work changed over the past few years. ‘We knew that more and more shops no longer required visiting because CBS started receiving the scanner data directly from those shops. For the shopkeepers, this was a welcome development. As for me, I find it a pity. To us interviewers, it was always a pleasant job and the response was guaranteed.’

Major step

On 13 December 2019, CBS interviewers made their rounds recording retail prices for the Dutch CPI for the very last time. Discontinuing store price observations is a major step for CBS. Collecting prices via scanner data and web scrapers further improves the quality of the consumer price index and saves costs. In addition, the shopkeeper is no longer inconvenienced. Van Linden: ‘We will, however, continue the regular visits in April and November to shops in a number of large cities, including clothes shops, furniture shops and supermarkets as well as restaurants and cafés, in order to compile price level indices at the European level (an obligation imposed by the EU’s statistical authority, ed.). But there, too, the number of price observations is decreasing. Fortunately, CBS still has sufficient other surveys which need to be carried out and which require face-to-face interviews. For example, surveys on health perceptions, changes taking place in society, lifestyle, etc.’

Scanner data and web scrapers

Koen Link, statistical researcher at CBS, explains how CBS is now able to collect all necessary information on retail prices efficiently using scanner data and web scraping methods: ‘In 2003, CBS started using supermarket scanner data for the first time. These are cash register data including the number of items sold and the turnover per product. This is how we determine the price per barcode. The use of these scanner data has taken off in recent years. In addition, CBS uses so-called web scrapers and robot tools. We let computer programs retrieve the information we need from the websites of clothes shops. The same is done for shoe shops and furniture shops as of 1 January 2020. As for barbershops, we use a robot tool that sends a signal once prices on the website are changed. Smaller shops nowadays have their own websites as well. If one or two shops are still missing from our data and we really need these data, we will send them a questionnaire.’
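To illustrate how a price per barcode can be derived from cash register data, here is a minimal sketch in Python. It assumes the price is taken as turnover divided by the number of items sold, summed over all records for a barcode in the period. The record layout, the function name unit_values and the sample figures are hypothetical and do not describe the actual CBS data or production system.

```python
from collections import defaultdict

def unit_values(records):
    """Return the average price per barcode (turnover / items sold).

    `records` is an iterable of (barcode, items_sold, turnover) tuples.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(lambda: [0, 0.0])  # barcode -> [items, turnover]
    for barcode, items_sold, turnover in records:
        totals[barcode][0] += items_sold
        totals[barcode][1] += turnover
    # Barcodes without any sales have no observable price and are skipped.
    return {
        barcode: turnover / items
        for barcode, (items, turnover) in totals.items()
        if items > 0
    }

# Made-up weekly records for two barcodes.
sample = [
    ("8710000000017", 120, 238.80),
    ("8710000000017", 80, 159.20),
    ("8710000000024", 50, 62.50),
]
prices = unit_values(sample)
```

Because the price follows from actual sales, every product sold contributes in proportion to how often it was bought, which a single shelf-price observation cannot capture.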

New statistical methods

To process the scanner and online data properly, CBS has developed special statistical methods and new practices. Web scrapers, for instance, produce enormous datasets, and classifying those data properly is important. Link explains: ‘In clothing, we need to make a distinction between dresses, skirts, short sleeves, long sleeves, etc. CBS uses name filters and checks the names of product groups instead of individual products. This enables us to categorise the products in a better way. The advantage of this new practice is that we arrive at a much more accurate figure, as we now incorporate all the data rather than a mere selection of items that we placed in the interviewers’ shopping basket by way of sampling.’
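A name filter of the kind Link describes could look roughly like the Python sketch below. It matches the name of a scraped product group against a short list of patterns and returns the first category that fits. The patterns, category labels and example group names are hypothetical; the filters CBS actually uses are not given in this article.

```python
import re

# Hypothetical filters, checked in order; the first match wins.
NAME_FILTERS = [
    ("dresses", re.compile(r"\bdress(es)?\b", re.IGNORECASE)),
    ("skirts", re.compile(r"\bskirts?\b", re.IGNORECASE)),
    ("short sleeves", re.compile(r"\bshort[- ]?sleeve", re.IGNORECASE)),
    ("long sleeves", re.compile(r"\blong[- ]?sleeve", re.IGNORECASE)),
]

def classify(group_name):
    """Return the category for a product group name, or None if no filter fits."""
    for category, pattern in NAME_FILTERS:
        if pattern.search(group_name):
            return category
    return None

# Made-up product group names as they might appear on a shop website.
examples = [
    "Women > Dresses > Summer",
    "Tops / Short-sleeved",
    "Men Long Sleeve Shirts",
    "Accessories",
]
labels = [classify(name) for name in examples]
```

Simple filters like these can misfire (a group called "Dress shirts" would land under dresses here), so in practice unmatched or doubtful groups would still need review.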

Ongoing development and innovation

By 2019, observations were already limited to 420 shops per month. These were mainly shoe shops, furniture shops and smaller specialist shops. That is all over now as well. Link says: ‘As we are expanding the use of scanner data and web scrapers, this type of store observation is not necessary anymore. We are the first statistical office within the euro area to discontinue store price observations altogether. That’s a milestone. Eventually, we aim to implement machine learning techniques to classify products into specific categories. We are working on that now. It’s a process of ongoing development and innovation.’