SMS delivery reports can be used to infer recipient’s location


A team of university researchers has devised a new side-channel attack named ‘Freaky Leaky SMS,’ which relies on the timing of SMS delivery reports to deduce a recipient’s location.

SMS delivery reports are handled by the mobile network’s SMSC (short message service center) and tell the sender whether a message has been delivered, accepted, failed, is undeliverable, has expired, or has been rejected.

While there are routing, network node propagation, and processing delays in this process, mobile networks’ fixed nature and specific physical characteristics result in predictable times when standard signal pathways are followed.

The researchers developed a machine learning algorithm that analyzes timing data in these SMS responses to find the recipient’s location with an accuracy of up to 96% for locations across different countries and up to 86% for two locations in the same country.

Preparatory work

The attacker will first have to collect some measurement data to make concrete correlations between the SMS delivery reports and the known locations of their target.

SMS transmission diagram (arxiv.org)

The more precise the attacker’s data about the target’s whereabouts, the more accurate the ML model’s location classifications will be in the attack phase.

To collect the data, the attacker must send multiple SMS messages to the target, either disguised as marketing messages that the recipient will ignore or dismiss as spam, or sent as silent SMS messages.

A silent SMS is a “type 0” message with no content that produces no notification on the target’s screen, yet the device still acknowledges its reception to the SMSC.

In their experiments, the authors of the paper used ADB to send bursts of 20 silent SMSes every hour for three days to multiple test devices in the United States, United Arab Emirates, and seven European countries, covering ten operators and various communication technologies and generations.
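
As a rough illustration of the measurement volume this implies, the sketch below works out the burst schedule for a single device. It assumes the bursts ran around the clock for the full three days, which the article does not confirm; the start date and the function name `burst_times` are hypothetical.

```python
from datetime import datetime, timedelta

BURST_SIZE = 20                      # silent SMSes per burst (from the article)
BURST_INTERVAL = timedelta(hours=1)  # one burst every hour (from the article)
CAMPAIGN_LENGTH = timedelta(days=3)  # three days (from the article)


def burst_times(start: datetime) -> list[datetime]:
    """Return the start time of every hourly burst in the campaign."""
    count = CAMPAIGN_LENGTH // BURST_INTERVAL  # number of whole intervals
    return [start + i * BURST_INTERVAL for i in range(count)]


# Hypothetical start date, used only to make the example concrete.
schedule = burst_times(datetime(2023, 1, 1, 0, 0))
total_messages = len(schedule) * BURST_SIZE
print(len(schedule), total_messages)  # prints: 72 1440
```

Under that assumption, each test device would receive 72 bursts, or 1,440 silent messages in total.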

Next, they measured the timing of the SMS delivery reports in each case and combined the data with the matching location signatures to generate a comprehensive ML evaluation dataset.
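
A minimal sketch of what one labeled dataset row could look like, assuming the attacker records a send timestamp and a delivery-report timestamp for every message in a burst. The summary statistics, the function name `burst_features`, and the label `location_A` are illustrative assumptions; the article does not list the exact features the researchers extracted.

```python
from statistics import mean, median, pstdev


def burst_features(sent_at, report_at):
    """Summarize one burst. Each delay is report time minus send time, in seconds."""
    delays = [r - s for s, r in zip(sent_at, report_at)]
    return {
        "mean": mean(delays),
        "median": median(delays),
        "std": pstdev(delays),  # population standard deviation of the burst
        "min": min(delays),
        "max": max(delays),
    }


# Made-up timestamps (seconds), not measurements from the paper.
sent = [0.0, 5.0, 10.0, 15.0]
reports = [2.0, 7.5, 12.0, 18.5]

# Pair the timing summary with the known location to form one training row.
row = {"location": "location_A", **burst_features(sent, reports)}
print(row)
```

Summarizing a whole burst rather than a single message is one way to smooth out the routing and processing jitter mentioned earlier.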

The ML model used a total of 60 nodes (10 input, 10 output, 40 hidden), and the training data also included the receiving location, connectivity conditions, network type, receiver distances, and other parameters.
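
To give a sense of how small such a network is, here is a sketch of a forward pass through a model with those node counts. It assumes a single fully connected hidden layer of 40 nodes, ReLU activations, and a softmax output; the article gives only the node counts, so the layout, the activations, and the random weights are all assumptions.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 10, 40, 10  # node counts reported in the article


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Compute weights @ x + biases for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    h = [max(0.0, v) for v in dense(x, hidden)]  # ReLU (assumed activation)
    return softmax(dense(h, output))


rng = random.Random(0)
hidden = init_layer(N_INPUT, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUT, rng)

probs = forward([0.5] * N_INPUT, hidden, output)  # dummy timing features
n_params = N_INPUT * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_OUTPUT + N_OUTPUT
print(len(probs), n_params)  # prints: 10 850
```

With this assumed layout the model has only 850 trainable parameters, which fits the researchers’ description of themselves as baseline attackers.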

Attack steps diagram (arxiv.org)

Locating the recipients

The experiment focuses on “closed world” attack scenarios, meaning the target’s location is classified as one of a set of predetermined locations.

The academics found that their model achieved high accuracy in discerning between domestic and overseas locations (96%), similarly good guesses in country classification (92%), and reasonably good performance for locations within the same region (62%-75%).

International classification results (arxiv.org)

Accuracy depends on the location, operator, and conditions. For example, in Germany, the system had an average accuracy of 68% over 57 different classifications, with the best performance being 92% in a specific German region.

Belgium had the best results, with an average of 86% correct guesses and a maximum of 95% in the best-performing region.

Complete experiment results (arxiv.org)

When three locations are considered in Germany, the model’s prediction accuracy drops to an average of 54% and tops out at 83% in the best-performing case, which is still significantly higher than the 33% expected from random guessing.

For Greece, the model delivered a notable average of 79% correct location predictions for three locations (random 33%) and reached 82% in the best case.
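
The comparison with random guessing can be made explicit with a short calculation. The sketch below assumes the candidate locations are equally likely, so chance accuracy is 1 divided by the number of locations; the function names are hypothetical, and the accuracy figures are the three-location averages quoted above.

```python
def chance_accuracy(n_locations: int) -> float:
    """Expected accuracy of uniform random guessing over equally likely locations."""
    return 1.0 / n_locations


def lift_over_chance(accuracy: float, n_locations: int) -> float:
    """How many times better than random guessing a given accuracy is."""
    return accuracy / chance_accuracy(n_locations)


# Average three-location accuracies reported in the article.
averages = {"Germany": 0.54, "Greece": 0.79}
for country, acc in averages.items():
    print(country, round(lift_over_chance(acc, 3), 2))
# prints:
# Germany 1.62
# Greece 2.37
```

By this measure, the German average is roughly 1.6 times better than chance and the Greek average roughly 2.4 times better.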

Summary of experiment results (arxiv.org)

The researchers left “open-world” cases, in which the target visits unknown locations, for future work. However, the paper still provides a short evaluation to explain how the prediction model can be adapted to these scenarios.

In short, open-world attacks are feasible by using probability outputs, applying anomaly detection, and including landmarks and other locations of interest in the ML training dataset. However, the scale of the attack grows exponentially, and the scope is beyond the present paper.
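
One simple way probability outputs could be used for this is to reject low-confidence predictions as “unknown” instead of forcing a choice among the known locations. The sketch below illustrates that idea only; the threshold of 0.6, the function name, and the location labels are hypothetical, and the article does not describe the paper’s actual procedure.

```python
def classify_open_world(probs, labels, threshold=0.6):
    """Return the most likely known location, or "unknown" if confidence is low."""
    best = max(range(len(probs)), key=probs.__getitem__)  # index of top probability
    if probs[best] < threshold:
        return "unknown"  # no known location is a convincing match
    return labels[best]


labels = ["location_A", "location_B", "location_C"]

# A confident prediction maps to a known location.
print(classify_open_world([0.85, 0.10, 0.05], labels))  # prints: location_A

# A flat distribution suggests the target may be somewhere unseen.
print(classify_open_world([0.40, 0.35, 0.25], labels))  # prints: unknown
```

A lower threshold would label more unfamiliar places as known locations, while a higher one would discard more correct predictions, so the value would need tuning per target.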

Conclusion

Although the attack involves tedious preparatory work, isn’t trivial to carry out, does not work well under all circumstances, and has several practical limitations, it still constitutes a potential privacy risk for users.

One of the researchers signing the paper, Evangelos Bitsikas, told BleepingComputer that for this experiment, they considered themselves baseline attackers, meaning that they were restricted in terms of resources, machine learning knowledge, and technical capacity.

This means that sophisticated attackers with more resources at their disposal could theoretically achieve more impact and even enjoy moderate success in the “open world” attack scenarios.

It is also worth noting that the same team of researchers developed a similar timing attack last year and proved that it’s possible to approximately locate users of popular instant messengers like Signal, Threema, and WhatsApp using message reception reports.


