|Screenshot of the public dashboard|
In a side note (highlighted in yellow in the screenshot) and in their blog, the partners guarantee that the underlying data is fully anonymous and not derived from customer data. This was certified by TÜV, an independent Technical Inspection Association.
Although I have no insight into this particular project, I assume that the underlying raw data is provided by eNBs using the 3GPP-defined cell trace at maximum detail level according to 3GPP TS 32.423.
This means that the trace collection entity gets the full ASN.1 contents of all RRC, S1AP and X2AP messages, but NAS messages, if provided at all, are encrypted. Also, the eNB has no insight into any user plane applications since it has no means to decode the IP payload. This guarantees that neither IMSI, IMEI, web addresses nor phone numbers are found in the raw data.
The key to a meaningful mobility analysis using this data might be that the S-TMSI value in E-UTRAN rarely changes and that, due to user inactivity settings, each subscriber generates multiple RRC connections per hour. Within these RRC connections we find RRC Measurement Reports and typically also some vendor-specific events that provide other important parameters from the lower layers of the radio interface, including uplink radio quality measurements like PUSCH SINR.
By looking at multiple RRC connections of the same S-TMSI and the reported air interface measurements, it is possible to determine whether the subscriber remains in the same place or moves around. It is also possible to determine whether a subscriber is located indoors or outdoors.
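The idea of grouping RRC connections by S-TMSI can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the record layout (an S-TMSI paired with a serving cell ID), the function name `classify_mobility` and both thresholds are hypothetical, and a real analysis would also evaluate the RRC Measurement Reports and radio quality values rather than just the serving cell.

```python
from collections import defaultdict


def classify_mobility(connections, min_connections=3, max_cells_stationary=1):
    """Classify each S-TMSI as 'stationary', 'moving' or 'unknown'.

    connections: iterable of (s_tmsi, serving_cell_id) tuples, one per
    RRC connection (hypothetical record layout, not a 3GPP format).
    """
    # Collect the serving cells seen for every S-TMSI.
    cells_by_ue = defaultdict(list)
    for s_tmsi, cell_id in connections:
        cells_by_ue[s_tmsi].append(cell_id)

    result = {}
    for s_tmsi, cells in cells_by_ue.items():
        if len(cells) < min_connections:
            # Too few RRC connections to say anything.
            result[s_tmsi] = "unknown"
        elif len(set(cells)) <= max_cells_stationary:
            # All connections were set up in the same cell(s).
            result[s_tmsi] = "stationary"
        else:
            result[s_tmsi] = "moving"
    return result


sample = [
    (0x0A000001, 101), (0x0A000001, 101), (0x0A000001, 101),
    (0x0A000002, 101), (0x0A000002, 102), (0x0A000002, 103),
]
print(classify_mobility(sample))
```

Counting distinct serving cells is a deliberately crude criterion. It only shows why a rarely changing S-TMSI is what makes it possible to correlate several RRC connections of the same subscriber.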
The trace collection entity writes the analysis results into a comprehensive data set in which even the S-TMSI values can be masked and scrambled for additional data privacy. The raw data is deleted.
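How the S-TMSI values are masked in this project is not known to me. One common way to scramble such an identifier is a keyed hash, sketched below as an assumption: the function name `pseudonymize_s_tmsi`, the use of HMAC-SHA256 and the 16-character truncation are my own choices for illustration, not the partners' method.

```python
import hashlib
import hmac
import secrets


def pseudonymize_s_tmsi(s_tmsi: int, key: bytes) -> str:
    """Return a keyed pseudonym for an S-TMSI (illustrative sketch only)."""
    # The S-TMSI is 40 bits long: 8-bit MMEC followed by 32-bit M-TMSI.
    message = s_tmsi.to_bytes(5, "big")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Truncation is an arbitrary choice to keep the pseudonym short.
    return digest[:16]


# Hypothetical usage: the key is generated once per analysis period and
# discarded afterwards, so pseudonyms cannot be linked across periods.
key = secrets.token_bytes(32)
print(pseudonymize_s_tmsi(0x1234567890, key))
```

With the same key, the same S-TMSI always maps to the same pseudonym, so the mobility results stay consistent within the data set, while the original value cannot be recovered without the key.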
In the end, this methodology allows a highly reliable mobility analysis while simultaneously protecting the data privacy of subscribers. The key difference compared to statistics based on crowd-sourced data, as published e.g. by umlaut, is that the 3GPP cell trace provides data for all RRC connections in the network. Crowd-sourced data collection, in contrast, requires the installation of certain apps (in the case of umlaut only Android apps are supported) and the subscriber's consent to the data collection.
However, it must be mentioned that the 3GPP cell trace cannot be used as a data source for the widely discussed Corona contact tracing apps, which make it possible to identify subscribers who have been in close proximity to someone who has tested positive for COVID-19. For this purpose, cell trace data lacks the accuracy needed to determine the positions of a subscriber and of the people nearby.