The contribution of Statistics Netherlands to the project was, first of all, a model-based description of the structure and activities of a business. The description had to be rich enough to allow statistical units to be formed and classified from the model information, and to allow alternative activity classifications to be produced. Next, a questionnaire was developed to collect the necessary model information from businesses. Furthermore, algorithms were developed to construct and classify statistical units using the model information as input. The following activities were carried out from 2000 to 2002:
- participation in an EU-wide User Needs survey that explored who the users of activity classifications are and what they need (User needs);
- further expanding the previously developed model for description of the structure and activities of businesses (Foundations);
- developing and testing an electronic questionnaire to collect the necessary information from businesses (Tools and systems);
- developing algorithms to construct and classify statistical units on the basis of information in the model, to update present activity classifications and to generate new ones (Foundations).
General information on the CLAMOUR project
Who was involved?
The project was part of, and financed by, the Fifth Framework Research Programme of the European Commission. It was carried out as a collaboration between five European Union countries: Denmark, Finland, France, the Netherlands and the United Kingdom.
Schedule of activities
The ideas for the project were developed during 1999 and formally accepted by the European Commission in January 2000. Work began shortly after and the project was finished in March 2002.
The work was divided into four areas of research:
- User Needs
- Foundations
- Linguistics
- Tools and Systems
Visit the CLAMOUR website for more information on the project and the participants. Project results have been published there too.
User Needs
National statistical institutes (NSIs) collect and classify data so that an accurate and up-to-the-minute international economic picture can be produced. Based on this picture, governments and research organisations can inform the wider debate and enterprises can base their policies on it. Such statistical information is provided as widely as possible.
It is important to establish how the information will be used and which form of classification provides the best value from the information released. To gain a better understanding, each NSI participating in the project interviewed a sample of business statistics users in its country. Users were asked how they currently use the information and which possible future changes may alter the data to be collected and the way they are presented.
In planning for the future, care must be taken to balance the need for a consistent measure across time with the need to keep a finger on the economic pulse. Furthermore, the burden on information suppliers, often those in the business community, should not be overlooked.
Foundations
The foundational work identified the fundamental building blocks from which statistical units and activity classifications can be derived. This work was divided into three work packages:
- constructing a model of the structure and activities of businesses;
- applications of the model to activity classifications (present ones, updating, generating new ones);
- applications of the model to statistical units.
The outcomes of the User Needs study have fed into the Foundational work by informing those responsible about the applicability of NACE (the European classification standard). The result should be a better statistical description of business structures and activities.
Linguistics
To improve the speed and quality with which statistical units and their activities are determined from the information a business provides, the project has developed linguistic methods and tools that can recognise the most likely meaning of descriptions. Existing tools are often deficient in recognising free text.
It should be possible to derive the relevant information for the model from the profile signature of a business: its processes, inputs and outputs, structure and so on. Within a given language and across the European community there are nuances that imply different meanings. This work has looked at how far these differences can be minimised. It may lead to interactive systems that provide intelligent guidance. The linguistics work covered four work packages:
- context based semantic disambiguation;
- compound term processing;
- analysis of highly structured descriptions;
- assisted matching between correlations.
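As a rough illustration of what matching a free-text activity description to a classification category involves, the sketch below ranks categories by simple keyword overlap. It is far simpler than the linguistic methods described above and is not the project's method. The category names, keyword sets and function names are all hypothetical and chosen only for this example; they are not NACE codes or project data.

```python
import re

# Hypothetical categories and keywords, for illustration only.
# These are not real NACE codes and do not come from the project.
CATEGORY_KEYWORDS = {
    "bakery": {"bread", "baking", "pastry", "flour"},
    "software": {"software", "programming", "applications", "development"},
    "road_transport": {"freight", "lorry", "haulage", "transport"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_categories(description):
    """Return (category, score) pairs, highest keyword overlap first.

    Ties are broken alphabetically by category name so the order is stable.
    """
    tokens = tokenize(description)
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def best_category(description):
    """Return the top-ranked category, or None when no keyword matches."""
    category, score = rank_categories(description)[0]
    return category if score > 0 else None
```

A call such as `best_category("We bake bread and pastry from organic flour")` selects the category whose keywords overlap most with the description. A description with no matching keywords yields `None`, which is the kind of case where context-based disambiguation or interactive guidance would be needed.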
Tools & Systems
The preceding work has also resulted in information for those involved in developing new statistical products and services.
One such piece of work is the development of an electronic questionnaire that assists data collection from businesses in a user-friendly way. The electronic questionnaire has been tested extensively in at least Finland, the Netherlands and the United Kingdom.