Centraal Bureau voor de Statistiek (Statistics Netherlands)

Discussion papers

2011

21-12-2011 Various claims have been made regarding the benefits that Enterprise Architecture (EA) delivers for both individual systems development projects and the organization as a whole. This paper presents the statistical findings of a survey study (n=293) carried out to empirically test these claims. First, we investigated which techniques are used in practice to stimulate conformance to EA. Secondly, we studied which benefits are actually gained. Thirdly, we verified whether EA creators (e.g. enterprise architects) and EA users (e.g. project members) differ in their perceptions regarding EA. Finally, we investigated which of the applied techniques most effectively increase project conformance to and effectiveness of EA. A multivariate regression analysis demonstrates that three techniques have a major impact on conformance: carrying out compliance assessments, management propagation of EA and providing assistance to projects. Although project conformance plays a central role in reaping various benefits at both the organizational and the project level, it is shown that a number of important benefits have not yet been fully achieved.

21-12-2011 This paper gives alternative derivations for the standard variance formulas in two-stage sampling. The derivations are based on a direct use of the statistical properties of the sampling errors in the second stage. For ease of exposition, we examine the specific case in which simple random sampling is used in both stages. These derivations might be useful for readers looking for more elementary approaches to two-stage sampling.
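For orientation, the standard textbook result for this case can be written as follows. The notation is an assumption made here and is not taken from the paper: N primary units of which n are sampled, M_i elements in primary unit i of which m_i are sampled, Y_i the total of primary unit i, and simple random sampling without replacement in both stages.

```latex
\[
\hat{Y} = \frac{N}{n} \sum_{i \in s} \frac{M_i}{m_i} \sum_{j \in s_i} y_{ij},
\qquad
\operatorname{Var}(\hat{Y}) =
  N^2 \left(1 - \frac{n}{N}\right) \frac{S_1^2}{n}
  + \frac{N}{n} \sum_{i=1}^{N} M_i^2 \left(1 - \frac{m_i}{M_i}\right) \frac{S_{2i}^2}{m_i},
\]
\[
S_1^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}\right)^2,
\quad
\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i,
\quad
S_{2i}^2 = \frac{1}{M_i - 1} \sum_{j=1}^{M_i} \left(y_{ij} - \bar{y}_i\right)^2 .
\]
```

The first term reflects variation between primary unit totals, and the second term reflects the second-stage sampling errors within the primary units.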

21-12-2011 This paper presents and discusses some new results on the second-order inclusion probabilities of a systematic probability proportional to size sample drawn from a randomly ordered list, also called randomized PPS sampling. It is shown that some standard approximations of these second-order inclusion probabilities, meant for relatively small sample sizes, need not be valid when the sample size n is of the same order as the population size N. In addition, it is shown that under a number of assumptions the variance formulas for rejective Poisson sampling can be applied to randomized PPS sampling designs when both n and N-n are large.

21-12-2011 Numerical and categorical data used for statistical analyses are often plagued with missing values and inconsistencies. In many cases, a number of missing values may be derived from the consistency rules imposed on the data and the observed values in a record. The methods used for such derivations are called deductive imputation. In this paper, we describe the newly developed deductive imputation functionality of the R package deducorrect.
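The idea of deductive imputation can be illustrated with a minimal Python sketch. It is not the deducorrect API: the function name `deduce_from_balance`, the record layout and the example variables are hypothetical, and only the simplest case is covered, namely a single linear equality rule with exactly one missing value.

```python
def deduce_from_balance(record, coeffs, total=0):
    """Fill in the single missing variable of a linear equality rule.

    The rule is sum(coeffs[v] * record[v]) == total; None marks a missing value.
    Returns a new record. If zero or more than one value is missing, or the
    missing variable has a zero coefficient, the record is returned unchanged.
    """
    missing = [v for v in coeffs if record.get(v) is None]
    if len(missing) != 1 or coeffs[missing[0]] == 0:
        return dict(record)
    target = missing[0]
    # Sum of the observed terms of the rule.
    observed = sum(coeffs[v] * record[v] for v in coeffs if v != target)
    filled = dict(record)
    # Solve the rule for the one unknown value.
    filled[target] = (total - observed) / coeffs[target]
    return filled


# Hypothetical rule: turnover - costs - profit == 0
rule = {"turnover": 1, "costs": -1, "profit": -1}
result = deduce_from_balance({"turnover": 100, "costs": 60, "profit": None}, rule)
unchanged = deduce_from_balance({"turnover": 100, "costs": None, "profit": None}, rule)
```

With two missing values the rule no longer determines a unique solution, so the sketch leaves the record as it is.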

20-12-2011 Analyses of categorical data are often hindered by the occurrence of inconsistent or incomplete raw data. Although R has many features for analyzing categorical data, the functionality for error localization and error correction is currently limited. The editrules package is designed to offer a user-friendly toolbox for edit definition, manipulation, and error localization based on the generalized paradigm of Fellegi and Holt.

03-11-2011 This report describes the results of the analyses of the Personal Wellbeing Index (PWI) for the Netherlands. We show that the original scale has sufficient construct validity. Six of the eight distinguished life domains contribute significantly to the explained variance of the overall quality of life. Further, PWI indicators are related to other items and alternative dependent variables as expected from theory. Lastly, the eight life domains can be aggregated into one PWI scale with high construct validity. We conclude that the PWI can be used as a quality of life measure instead of a single life satisfaction indicator in order to reveal the multi-dimensionality of quality of life.

19-10-2011 The Dutch Labor Force Survey (LFS) is based on a rotating panel design. Recently an estimation procedure that is based on a multivariate structural time series model has been adopted to produce monthly official statistics about the labor force. This approach handles problems with rotation group bias and small sample sizes in an effective way and enables Statistics Netherlands to produce timely and accurate estimates about the labor market. In this paper the time series model is extended by incorporating an auxiliary series about people registered as unemployed in the register of the Office for Employment and Income. The information of the auxiliary series is used to improve the precision of the monthly unemployment figures by modelling the correlation between the trends of the LFS series and the auxiliary series of the registered unemployed labor force. It appears that the trend of the series of the registered unemployed labor force is cointegrated or almost cointegrated with the trend of the estimated unemployed labor force of the LFS for several domains. This results in a considerable decrease of the standard errors for the monthly unemployed labor force.

28-09-2011 This paper describes how the bootstrap resampling method may be used to assess the accuracy of estimates based on a combination of data from registers and sample surveys. We consider three different estimators that may be applied in this context. The validity of the proposed bootstrap method is tested in a simulation study with realistic data from the Dutch Educational Attainment File.

06-09-2011 This paper is the first of two papers describing the editrules package. The current paper is concerned with the treatment of numerical data under linear constraints, while the accompanying paper (Van der Loo and De Jonge, 2011) is concerned with constrained categorical and mixed data. The editrules package is designed to offer a user-friendly interface for edit definition, manipulation and checking. The package offers functionality for error localization based on the paradigm of Fellegi and Holt and a flexible interface to binary programming based on the choice point paradigm. Lower-level functions include echelon transformation of linear systems, variable substitution and a fast Fourier-Motzkin elimination routine. We describe theory, implementation and give examples of package usage.
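As an illustration of the elimination step mentioned above, the following Python sketch performs one round of Fourier-Motzkin elimination on a system of inequalities of the form a·x <= b. It is not the editrules implementation: the function name `eliminate` and the row representation are assumptions made here, equalities and strict inequalities are not handled, and redundant rows are not removed.

```python
from fractions import Fraction


def eliminate(rows, k):
    """Eliminate variable k from inequalities sum(a[j] * x[j]) <= b.

    Each row is a pair (a, b), with a the list of coefficients.
    Returns rows in which the coefficient of x[k] is zero.
    """
    pos, neg, rest = [], [], []
    for a, b in rows:
        a = [Fraction(c) for c in a]  # exact arithmetic avoids rounding issues
        b = Fraction(b)
        if a[k] > 0:
            pos.append((a, b))
        elif a[k] < 0:
            neg.append((a, b))
        else:
            rest.append((a, b))
    # Combine every upper bound on x[k] with every lower bound on x[k].
    for ap, bp in pos:
        for an, bn in neg:
            sp, sn = 1 / ap[k], -1 / an[k]  # both scale factors are positive
            a = [sp * p + sn * n for p, n in zip(ap, an)]
            rest.append((a, sp * bp + sn * bn))
    return rest


# x - y <= 0 and y <= 1; eliminating y (index 1) leaves x <= 1
reduced = eliminate([([1, -1], 0), ([0, 1], 1)], 1)
```

Because each positive-coefficient row is paired with each negative-coefficient row, the number of rows can grow quickly; practical routines prune redundant rows after every step.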

24-08-2011 Mixed-mode surveys are susceptible to mode-dependent selection effects and measurement errors, collectively known as mode effects. In sequential mixed-mode surveys, where non-respondents in one mode are re-approached using a different mode, it is likely that the mode composition of the response differs between subpopulations or between subsequent editions of the survey. Such variations in the mode composition lead to variations in the measurement errors, invalidating classical inference. An approach to inference in these circumstances is proposed, by calibrating the mode composition of the response to fixed levels. Assumptions and risks associated with such a procedure are discussed. The case of the Dutch Crime Survey is discussed as an example.

29-06-2011 Since raw (survey) data usually has to be edited before statistical analysis can take place, the availability of data cleaning algorithms is important to many statisticians. In this paper the implementation of three data correction methods in R is described. The methods of this package can be used to correct numerical data under linear restrictions for typing errors, rounding errors, sign errors and value interchanges. The algorithms, based on earlier work of Scholtus, are described and implementation details with coded examples are given. Although the algorithms were originally developed with financial balance accounts in mind, they are formulated generically and can be applied in a wider range of applications.

09-06-2011 This paper shows that 19 relevant attributes of quality reports can be distinguished. These attributes are useful if we want to systematically manage the quality of quality reports and were established through analysis of documents about quality reporting and the minutes of the SQ-ESAC workshop about quality reporting. Each attribute is defined, but according to the Object-oriented Quality Management model more steps can be taken. Requirements can be formulated for each attribute and causes and effects of problems can be analyzed. Based on these requirements and risk analysis, measures can be taken to assure the quality of quality reports.

07-06-2011 In most surveys all sample units receive the same treatment and the same design features apply to all selected people and households. In this paper, it is explained how survey designs may be tailored to optimize response rates and to reduce nonresponse selectivity. Such designs are called adaptive survey designs. The basic ingredients of such designs are introduced and discussed and illustrated with a number of examples including a pilot study.

05-04-2011 Inventories play a crucial role in explaining business cycle turning points. Inventories contributed 0.7 percentage point to a 4 percent contraction of economic activity in 2009. In light of this, demand for inventory data has been growing since the financial crisis hit in late 2008. This paper analyses Dutch wholesale and manufacturing inventories and relates them to the business cycle.

01-04-2011 Inventories are a useful statistic for tracking and analysing short-term economic developments. This paper by Floris van Ruth and Marcel van Velzen describes how the index of inventories of finished goods in the manufacturing industry can be used in business cycle analysis. Inventories themselves lag business cycle developments, and are therefore of limited use. Using the turnover index of the manufacturing industry to compute a ratio of inventory to sales (ISR) produces a new and leading business cycle indicator. The ISR is shown to consistently lead Dutch business cycle developments by one to two quarters. It is therefore one of the few real, i.e. non-financial and non-sentiment, leading indicators. Inventories are shown to exhibit clear co-movement with sales and the business cycle. The countercyclical development of the ISR is therefore explained by the fact that turnover reacts more strongly to business cycle developments than inventories.

25-03-2011 In this paper, we describe the controversy that arose between Statistics Netherlands and the Ministry of Economic Affairs in 2009 after the Dutch government announced a tax relief measure for businesses, which deteriorated the quality of tax data used by Statistics Netherlands for producing short-term statistics.

23-03-2011 Final report on Grant agreement no. 50303 2008 003 2008 352. In this discussion paper methods are presented to compile water abstraction and water use data at the level of River Basins in the Netherlands, for the years 2004-2008. In general, the methods build upon existing national data on water abstraction and drinking water use.

23-03-2011 Organizational change (OC) is an important complementarity factor in the process of creating business value from information technology (IT) investments. This paper investigates complementarities between IT capital and OC initiatives of the firm. It analyzes the productivity impact of different clusters of IT and OC in the manufacturing and services sectors of the economy. Three dimensions of OC are studied: process, structure, and boundary changes. Two distinct econometric approaches are applied to a unique and detailed sample of 32,619 firm-level observations in the Netherlands for the period 1994-2006. The results reveal that the productivity effect of IT significantly increases when technology investments are accompanied by relevant organizational changes. The observed complementarity effects between IT and OC are stronger for services than for manufacturing firms. The effects become stronger if different types of change are combined with each other and form clusters.

10-03-2011 It is not only important to produce good quality statistics, but also that the users of these statistics believe that they are of good quality. Therefore, it is necessary that the subsequent provisional estimates of the national accounts show a similar picture of economic performance; that is, the subsequent estimates have to be sufficiently reliable. This paper analyses to what extent this requirement is met, taking into account that this requirement seriously conflicts with timeliness. A computer program was developed to enable a quick reliability check of a large number of economic variables derived from the national accounts.

01-03-2011 Apart from the traditional sources used by National Statistical Institutes, like sample surveys and administrative sources, nowadays more and more electronic sources of information are available that potentially can be used for the production of statistics. In the paper four sources are studied: i) Product prices on the internet, ii) Mobile phone location data, iii) Twitter text messages, and iv) Global Positioning System (GPS) data and traffic loop information. For each data source an overview is given of the usability of the collected information, as well as the practical and methodological challenges that lie ahead.

24-02-2011 In surveys, respondents have a tendency to round their answers. For example, in the Labour Force Survey people are asked about the period they have been unemployed. There is clearly a tendency to give answers that are rounded to years or half years. Because of this rounding, statistics based on these data tend to be biased. In this paper we introduce a method with which the rounding mechanism is modelled together with the ‘true’ underlying distribution. These are then used to select samples which are likely to be rounded and impute new values for these. This method is applied to the Labour Force Survey data. An investigation of robustness shows that the method is robust against misspecification of the model of the underlying distribution and to misspecification of the rounding mechanism.

17-02-2011 Statistics Netherlands has started a process to review the statistical priorities. The demands of society change, but budget restrictions and the desire to reduce administrative burden do not allow an increase of staff or surveys. Therefore, negative priorities are needed. A working group, chaired by the chief statistical officer and with members of the statistics divisions, was asked to assess proposals that were put forward by the statistics divisions. In order to make the comparison of the proposals more objective, an assessment model was developed in cooperation with an external consultancy. This paper describes the approach of the process.

03-02-2011 Monthly Short Term Business statistics at Statistics Netherlands can be based on survey data, VAT records or a combination of these two data sources. Both sources are incomplete when statistics need to be produced. The survey response rate increases gradually in time and is still far from 100% after a month of data collection. The VAT register also fills gradually in time because i) quite a few enterprises report on a quarterly or annual basis, and ii) those that report on a monthly basis do so unevenly over time. In this paper we investigate and compare the representativity of survey and VAT response as a function of time. The objective is to determine whether VAT is as representative as survey data and can be used to produce accurate statistics. For this purpose we use so-called Representativity (R)-indicators and partial R-indicators. The results can be used in designing data collection for monthly statistics and in assessing the timing of processing survey and register data.
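In the literature on representativity indicators, the R-indicator is usually defined as R = 1 - 2 S(rho), where S(rho) is the standard deviation of the response propensities. The Python sketch below computes this quantity for a given list of propensities. It makes simplifying assumptions that are not taken from the paper: the propensities are treated as known rather than estimated from a model, no design weights are used, and the population form of the standard deviation is applied.

```python
from statistics import pstdev


def r_indicator(propensities):
    """R = 1 - 2 * S(rho), with S the population standard deviation.

    A value of 1 means that all units have the same response propensity;
    lower values indicate a less representative response.
    """
    return 1 - 2 * pstdev(propensities)


equal = r_indicator([0.5, 0.5, 0.5, 0.5])    # identical propensities
extreme = r_indicator([0.0, 1.0, 0.0, 1.0])  # maximal variation
```

Evaluating the indicator at successive points in the data collection period, separately for survey response and VAT response, would give the kind of comparison over time that the abstract describes.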

03-02-2011 In the European Union tens of billions of euros are spent on regional policy every year. A major part of this amount is allocated on the basis of regional gross domestic product per capita. In this paper by Henk Nijmeijer an inventory is drawn up of recent work on quality of regional accounts estimates. Special attention is paid to the instrument of process tables. The regional accounts should be compiled in close cooperation with the national accounts. The quality of the national accounts estimates could be improved by the findings of the regional accounts compiling process.

28-01-2011 Methods are considered to calculate a set of consistent price index numbers from an inconsistent set of chained index numbers. The inconsistencies are due to the existence of cycles in the price index graph. The initial index numbers are calculated using an index formula of the user’s choice. The formula is only required to satisfy a few simple consistency conditions, which do not include transitivity. One method (due to Hill) uses spanning trees to solve (or rather: sidestep) the inconsistency problem. The second method seeks to adjust the initial values in such a way that the new index numbers satisfy a transitivity criterion, and are close to the original index numbers. The approach in the present paper is inspired by levelling in land surveying.
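The spanning tree idea can be sketched in Python as follows. The sketch assumes that a spanning tree of the price index graph has already been chosen and that the bilateral index formula satisfies reciprocity (the index of a relative to b is the inverse of the index of b relative to a). The function name `chain_along_tree` and the data layout are hypothetical, and the selection of the tree itself is not shown.

```python
from collections import deque


def chain_along_tree(tree_edges, base):
    """Chain bilateral indices along a spanning tree.

    tree_edges maps a pair (a, b) to the bilateral index of b relative to a.
    Returns the price level of every node relative to the base node.
    """
    adjacency = {}
    for (a, b), index in tree_edges.items():
        adjacency.setdefault(a, []).append((b, index))
        adjacency.setdefault(b, []).append((a, 1 / index))  # assumed reciprocity
    level = {base: 1.0}
    queue = deque([base])
    while queue:  # breadth-first walk; a tree has one path to each node
        node = queue.popleft()
        for neighbour, index in adjacency.get(node, []):
            if neighbour not in level:
                level[neighbour] = level[node] * index
                queue.append(neighbour)
    return level


# Hypothetical bilateral indices on the edges A-B and B-C
levels = chain_along_tree({("A", "B"): 1.1, ("B", "C"): 1.2}, "A")
```

Because a tree contains no cycles, every pair of nodes is linked by exactly one chain, so the resulting index numbers are transitive by construction. Comparisons that are not on the tree are simply ignored, which is why the abstract speaks of sidestepping the problem.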

28-01-2011 We investigate the relationship between nonresponse error and measurement error as a function of a number of survey design features. Both types of survey error are quantified using indicators. Nonresponse error is analysed in terms of maximal nonresponse bias. Measurement error is decomposed into measurement profile risk and response bias. A measurement profile is a certain response style or behaviour.

14-01-2011 Within the ongoing redesign program of social surveys at Statistics Netherlands a small area estimation method for labour status has been developed. The model used is the basic unit-level model, which is a linear mixed model with random area effects, where the areas are municipalities. We discuss several issues concerning model choice, including the use of linear (mixed) models for binary variables, the use of posterior means instead of maximum likelihood estimates to prevent zero or too small estimates of between area variance and the use of covariates at both the unit and area level. Several model selection measures and graphical diagnostics have been applied to arrive at a set of covariates used in the model. We focus on the estimation of municipal unemployment fractions, but also discuss estimation of fractions employed and not belonging to the labour force. The municipal estimates are benchmarked such that they are consistent with regularly produced provincial estimates. The small area estimates thus obtained have smaller estimated mean squared errors than the current estimates based on the generalized regression estimator, and display a much more plausible development over time.
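In its usual textbook form, the basic unit-level model can be written as below. The notation is an assumption made here rather than the paper's own: y_{ij} is the labour status indicator of person j in municipality i, x_{ij} the vector of covariates, v_i the random municipality effect and e_{ij} the unit-level error.

```latex
\[
y_{ij} = x_{ij}^{\top} \beta + v_i + e_{ij},
\qquad
v_i \sim N(0, \sigma_v^2),
\quad
e_{ij} \sim N(0, \sigma_e^2),
\]
```

with the v_i and e_{ij} mutually independent. The between-area variance \sigma_v^2 is the quantity for which the abstract notes that maximum likelihood estimates can be zero or too small.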

11-01-2011 This paper by Marcel van Velzen and Leendert Hoven describes the method used in the compilation of the monthly volume index of inventories of finished goods for the Dutch manufacturing industry. The index was introduced at the end of 2009. In the paper, the plausibility of the outcomes is assessed. The paper also addresses the potential use of the index in the compilation of production indices.