Author(s): May Offermans, Yvonne Gootzen, Edwin de Jonge, Jan van der Laan, Frank Pijpers, Shan Shah, Martijn Tennekes, Peter-Paul de Wolf. Publication date: 26/02/2021 12:07

Pilot Study: Mobile Phone Meta Data Records – Introduction to the research method

Appendix

Further elaboration of point 1: not all devices are active every hour

The attacker wants to determine the region x of the device i.
The area is known for all devices except i.

We know that device i is in either area A or area B, because we know the location of i at times t-1 and t+1. We therefore restrict ourselves to two possible areas; more possible areas only make the location more uncertain.
We observe only a fraction f of the devices.

Given the observed counts per area, n_A and n_B, and the known numbers per area, N_A and N_B, what is the probability that device i is in area A, and what is the probability that it is in area B? The latter follows automatically from the former, because the device is either in A or in B.
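The question above can be illustrated with a small Bayesian sketch. The source does not state the exact model behind its calculation, so everything here is an assumption made for illustration: N_A and N_B count the devices other than i, every device (including i) is observed independently with probability f, and the attacker starts from an equal prior for A and B. The function names binom_pmf and prob_device_in_a are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success probability p."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_device_in_a(n_a, n_b, N_a, N_b, f, prior_a=0.5):
    """Posterior probability that device i is in area A.

    Assumed model (not taken from the source):
    - N_a and N_b are the known numbers of devices other than i in A and B;
    - each device, including i, is observed independently with probability f;
    - n_a and n_b are the observed counts per area.
    """
    # Hypothesis A: device i is in A, so A holds N_a + 1 devices and B holds N_b.
    like_a = binom_pmf(n_a, N_a + 1, f) * binom_pmf(n_b, N_b, f)
    # Hypothesis B: device i is in B, so A holds N_a devices and B holds N_b + 1.
    like_b = binom_pmf(n_a, N_a, f) * binom_pmf(n_b, N_b + 1, f)

    evidence = prior_a * like_a + (1 - prior_a) * like_b
    if evidence == 0:
        raise ValueError("observed counts are impossible under both hypotheses")
    return prior_a * like_a / evidence


if __name__ == "__main__":
    # Small count in A, larger count in B, with f = 0.8.
    print(prob_device_in_a(n_a=2, n_b=16, N_a=3, N_b=20, f=0.8))
    # Equal counts in both areas.
    print(prob_device_in_a(n_a=16, n_b=16, N_a=20, N_b=20, f=0.8))
```

Under these assumptions, equal counts and equal known numbers in both areas leave the two areas equally probable, while an observed count that exceeds the known number of other devices in an area can only be explained by device i being there.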

Figure 5 shows the probability that i is in A for different values of n_A and n_B. N_A and N_B are chosen to be a factor 1/0.8 higher than n_A and n_B, so the fraction f is equal to 0.8. The vertical dotted line indicates the boundary below which the counts of an area are suppressed. The horizontal dotted line indicates a probability of 0.1: below this line we know with at least 90% certainty that i is in region B. For values of n_A and n_B greater than 15, each region is more or less equally probable, so the data in the flow cube gives hardly any additional information about the location of the device.