Text analysis of motivations for (not) donating smartphone sensor data

This paper analyzes and classifies respondents’ stated reasons for (probable) refusal to share smartphone‐sensor data, based on two closely related questionnaires fielded in close succession: a study in the LISS panel and a consent survey conducted by Statistics Netherlands (SN).

Open‐ended survey answers capture rich motivations but are costly to code by hand. We study respondents’ stated reasons for (probable) refusal to share smartphone‐sensor data, using two closely related Dutch questionnaires fielded in 2017–2018: a LISS panel study (CentERdata) and a consent survey by Statistics Netherlands (SN). Both questionnaires share a core set of sensor tasks (GPS location, selfie, house exterior photo, short video); the LISS study additionally asks about connecting a wearable, and the SN survey adds a receipt photo. We transfer an 11‐category motivation taxonomy from the LISS panel to the consent survey via a transparent NLP pipeline: light normalization, rule‐based keyword extraction, and elastic‐net logistic regression with a single probability threshold for multi‐label assignment. The approach scales the coding of short survey texts while remaining interpretable, and it yields concrete design guidance: privacy explanations, reduced burden for camera tasks, and tighter instructions.
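
To make the shape of such a pipeline concrete, the following is a minimal sketch rather than the authors' implementation. It illustrates light normalization, rule‐based keyword flags, and multi‐label assignment with one shared probability threshold. Everything named here is an assumption for illustration: the three categories, the English keyword patterns (the actual responses are Dutch), and the hand‐set weights, which merely stand in for coefficients that a fitted elastic‐net logistic regression would supply.

```python
import math
import re
import unicodedata

# Hypothetical keyword rules (category -> regex patterns); not the paper's
# 11-category taxonomy or its actual lexicon.
KEYWORD_RULES = {
    "privacy": [r"\bprivacy\b", r"\bpersonal\b"],
    "burden": [r"\btoo much (work|effort)\b"],
    "no_smartphone": [r"\bno (smart)?phone\b"],
}

# Hypothetical (intercept, coefficients) per category. In a real pipeline these
# would come from a fitted elastic-net logistic regression, one model per label.
WEIGHTS = {
    "privacy": (-2.0, {"privacy": 4.0}),
    "burden": (-2.0, {"burden": 4.0}),
    "no_smartphone": (-2.0, {"no_smartphone": 4.0}),
}


def normalize(text):
    """Light normalization: NFKC, lowercase, unify quotes/hyphens, collapse spaces."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = text.replace("\u2019", "'").replace("\u2010", "-")
    return re.sub(r"\s+", " ", text).strip()


def keyword_features(text):
    """Return one 0/1 flag per category: 1 if any of its patterns matches."""
    return {
        cat: int(any(re.search(p, text) for p in patterns))
        for cat, patterns in KEYWORD_RULES.items()
    }


def predict_proba(features):
    """Logistic score per category from the (assumed) linear weights."""
    probs = {}
    for cat, (intercept, coefs) in WEIGHTS.items():
        z = intercept + sum(w * features.get(f, 0) for f, w in coefs.items())
        probs[cat] = 1.0 / (1.0 + math.exp(-z))
    return probs


def assign_labels(probs, threshold=0.5):
    """Multi-label assignment: keep every category at or above one shared threshold."""
    return sorted(cat for cat, p in probs.items() if p >= threshold)


def code_response(text, threshold=0.5):
    """Run the full sketch: normalize -> keyword flags -> probabilities -> labels."""
    return assign_labels(predict_proba(keyword_features(normalize(text))), threshold)


if __name__ == "__main__":
    print(code_response("I worry about my  PRIVACY and it takes too much effort"))
    print(code_response("I have no smartphone"))
```

Because a single threshold applies to all categories, a response can receive several labels, exactly one, or none; lowering the threshold trades precision for coverage across every category at once.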

Smeets, M., J. Bakker, and V. Meertens (2025). To share or not to share? Text analysis of survey responses to detect motivations for (not) donating smartphone sensor data to Statistics Netherlands. Discussion paper, Statistics Netherlands, The Hague/Heerlen.