The use of advanced transportation monitoring data for official statistics

Ma, Y. (2016). The use of advanced transportation monitoring data for official statistics. Dissertation, Erasmus University Rotterdam, handle:1765/80174.
© CBS
Dissertation on the use of electronically collected transport data for official statistics on transportation demand and flows.

The contributions of this dissertation are as follows. First, it considers fusing different data sources. Flow observations fall into two types: link flows, observed for instance by loop detectors and Weigh-in-Motion systems, and path flows, observed by cameras and Bluetooth scanners, which can identify individual vehicles. Path flows can significantly increase the accuracy of the estimated demand, since they reduce the uncertainty in matching flow observations to demand.

In addition, an innovative concept of origin-destination tuples is proposed and validated. Since cameras and Bluetooth scanners can identify vehicles, they offer an opportunity to gain insight into the trip chains of freight trucks. From the demand side, trip chains are represented quantitatively as origin-destination tuples, that is, sets of origin-destination pairs. Such tuples can be found in survey data, but hardly in link flow observations. Combining survey data with path flows reveals the trip chains of freight trucks.
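As an illustration of the data structure only, the sketch below turns an ordered list of stops into a tuple of origin-destination pairs and counts how often each tuple occurs. The truck identifiers, zone names, and the helper `chain_to_od_tuple` are hypothetical, and keeping the pairs in travel order is an assumption of this sketch rather than something stated in the dissertation.

```python
from collections import Counter


def chain_to_od_tuple(stops):
    """Convert an ordered list of stop zones into a tuple of (origin, destination) pairs."""
    if len(stops) < 2:
        raise ValueError("a trip chain needs at least two stops")
    # Pair each stop with the next one: [A, B, C] -> ((A, B), (B, C))
    return tuple(zip(stops[:-1], stops[1:]))


# Hypothetical trip chains, e.g. reconstructed from successive vehicle identifications.
chains = {
    "truck_1": ["Rotterdam", "Utrecht", "Zwolle"],
    "truck_2": ["Rotterdam", "Utrecht", "Zwolle"],
    "truck_3": ["Amsterdam", "Utrecht"],
}

# Demand expressed per origin-destination tuple instead of per single pair.
tuple_counts = Counter(chain_to_od_tuple(stops) for stops in chains.values())

for od_tuple, count in tuple_counts.items():
    print(od_tuple, count)
```

Counting per tuple rather than per pair keeps the information that the Rotterdam-Utrecht and Utrecht-Zwolle legs were driven by the same vehicle, which a link count alone would not reveal.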

Furthermore, the Kullback-Leibler divergence method is proposed to generalize the information minimization method and to relax the assumption from Stirling’s approximation. The hierarchical Bayesian method takes the stochastic nature of demand and flows into account, which is more realistic than treating demand and flow as deterministic values.
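For readers unfamiliar with the quantity, the sketch below computes the standard discrete Kullback-Leibler divergence between an estimated demand pattern and a prior demand pattern. It shows only the generic definition; the dissertation's actual objective function and constraints are not given here, and the demand figures and function names are hypothetical.

```python
import math


def normalize(counts):
    """Scale non-negative counts so that they sum to one."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q), in nats."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0 by convention
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical trips for three origin-destination pairs.
prior_demand = [120.0, 80.0, 200.0]
estimated_demand = [100.0, 100.0, 200.0]

divergence = kl_divergence(normalize(estimated_demand), normalize(prior_demand))
print(divergence)
```

The divergence is zero when the estimated pattern equals the prior pattern and grows as the two move apart, which is why it can serve as a distance-like penalty in demand estimation.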

The demand information from Statistics Netherlands is taken as prior information with a certain distribution. Using flow observations in the road network as evidence, the posterior demand is obtained for two situations. The first assumes that the errors follow a normal distribution. This assumption leads to an analytical approach, which can be applied to obtain the posterior demand quickly. Because the symmetric shape of the normal distribution may underrepresent the probability of a large flow, a second situation with log-normally distributed errors is also discussed. In this case, an analytical approach is no longer feasible. When Markov Chain Monte Carlo simulation is applied, with Metropolis-Hastings sampling nested within Gibbs sampling, the computations take so long to reach equilibrium that the method becomes practically infeasible.
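To show why the normal case admits a closed-form answer, the sketch below applies the textbook normal-normal conjugate update to a single origin-destination pair. It assumes, purely for illustration, that the observed flow equals a known share of the demand plus normal error; the function name, the `share` parameter, and all numbers are hypothetical and do not come from the dissertation.

```python
def posterior_normal(prior_mean, prior_var, flow_obs, obs_var, share=1.0):
    """Posterior mean and variance of demand under a normal prior and normal errors.

    Assumed model (illustrative only):
        demand   ~ Normal(prior_mean, prior_var)
        flow_obs ~ Normal(share * demand, obs_var)
    """
    if prior_var <= 0 or obs_var <= 0:
        raise ValueError("variances must be positive")
    prior_precision = 1.0 / prior_var
    data_precision = share ** 2 / obs_var
    # Precisions add; the posterior mean is a precision-weighted combination.
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_mean * prior_precision + share * flow_obs / obs_var)
    return post_mean, post_var


# Hypothetical numbers: prior demand of 100 trips, observed flow of 120 vehicles.
post_mean, post_var = posterior_normal(
    prior_mean=100.0, prior_var=400.0, flow_obs=120.0, obs_var=100.0
)
print(post_mean, post_var)
```

With these inputs the posterior mean is about 116 and the posterior variance about 80: the estimate moves from the prior towards the observation, weighted by their precisions, and no sampling is needed. No comparable closed form exists once the errors are log-normal, which is where simulation comes in.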

The last chapter describes how the hierarchical Bayesian network combined with a multi-process model is applied to forecast demand. To our knowledge, no paper applying this method has been published in this field. The multi-process model in the Bayesian framework has the advantage of taking the demand of several previous days into account instead of only the previous day’s demand: each pre-defined scenario, a combination of the previous days’ demand, is given a prior probability, and a posterior probability is then computed for each scenario. Usually, the scenario with the shortest Euclidean distance to the true scenario gains unit probability, and the rest have zero probability. With this multi-process model, people’s experience is taken into account through the pre-defined scenarios.
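The scenario weighting can be pictured with a small sketch: each scenario predicts a demand vector, receives a prior probability, and is reweighted by how well it matches the observed demand. The Gaussian likelihood with a common variance, the function name, and the numbers are assumptions made for illustration; the dissertation's multi-process model is not specified in this summary.

```python
import math


def scenario_posteriors(predictions, priors, observed, obs_var):
    """Posterior probability of each scenario given the observed demand.

    Assumes (for illustration) independent normal errors with variance obs_var,
    so the log-likelihood depends on the squared Euclidean distance only.
    """
    log_weights = []
    for prediction, prior in zip(predictions, priors):
        sq_dist = sum((o - p) ** 2 for o, p in zip(observed, prediction))
        log_weights.append(math.log(prior) - 0.5 * sq_dist / obs_var)
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    largest = max(log_weights)
    weights = [math.exp(lw - largest) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical demand predictions (two origin-destination pairs) for three scenarios.
predictions = [[100.0, 50.0], [110.0, 60.0], [130.0, 80.0]]
priors = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
observed = [108.0, 59.0]

posteriors = scenario_posteriors(predictions, priors, observed, obs_var=4.0)
print(posteriors)
```

In this example almost all posterior mass lands on the second scenario, the one closest in Euclidean distance to the observation, which mirrors the behaviour described above where one scenario effectively gains unit probability.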
