Machine learning transport mode prediction in a smartphone-based travel survey
Reliable statistics on travel behavior are important for national infrastructure planning, transport policy‐making, and understanding mobility patterns. Recent advances in smartphone‐based travel surveys enable passive data collection via smartphone Global Positioning System (GPS) sensors. These smart surveys offer an alternative to traditional travel diary surveys, which typically impose a high response burden on participants. This study examines the application of supervised machine learning models to automatically identify transport modes from GPS measurements obtained through a travel app developed by Statistics Netherlands. We compare Random Forest and Extreme Gradient Boosting classification models trained on GPS‐based features in combination with contextual location‐based features from OpenStreetMap, as well as temporal features derived from time‐related information and previous travel behavior. The Extreme Gradient Boosting model trained on the complete feature set achieved the highest accuracy (0.91) and macro‐averaged F1‐score (0.84), and also the best accuracy (0.84) when evaluated on external validation data. Although these results suggest that fully automated transport mode classification in the travel app may not yet be feasible, a semi‐automated approach with targeted prompts could be used to balance transport mode classification accuracy and response burden.
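The abstract reports both accuracy and a macro‐averaged F1‐score. As a minimal sketch of how the two metrics differ, the Python snippet below computes both on a small set of hypothetical labels. The labels, the two transport modes, and the function names are illustrative assumptions and are not taken from the paper's data or code. Because the macro average weights every transport mode equally, errors on a rare mode lower it more than they lower accuracy.

```python
def accuracy(y_true, y_pred):
    """Share of observations whose predicted mode equals the true mode."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical, imbalanced toy labels (not from the study):
# eight car trips, two bike trips, one bike trip misclassified as car.
y_true = ["car"] * 8 + ["bike"] * 2
y_pred = ["car"] * 8 + ["car", "bike"]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.2f}")
```

In this toy case a single error on the minority class leaves accuracy at 0.90 while the macro‐averaged F1‐score falls to about 0.80, which illustrates why a model can score lower on macro F1 than on accuracy when some modes are rare.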
Boer, Q., Y. Gootzen, J. Klingwort, D. Remmerswaal, P. Lugtig (2025). Machine‐learning based transport mode prediction in a smartphone‐based travel and mobility survey. Discussion paper, Statistics Netherlands, The Hague/Heerlen.
Downloads
- Discussion Paper - Transport mode prediction ML