WhatsApp closes loophole that let researchers collect data on 3.5B accounts

Messaging giant WhatsApp has around three billion users in more than 180 countries. Researchers say they were able to identify around 3.5 billion registered WhatsApp accounts thanks to a flaw in the software. That higher number is possible because WhatsApp’s API returns all accounts registered to phone numbers, including inactive, recycled, or abandoned ones, not just active users.

If you’re going to message a WhatsApp user, first you need to be sure that they have an account with the service. WhatsApp lets apps do that by sending a person’s phone number to an application programming interface (API). The API checks whether each number is registered with WhatsApp and returns basic public information.

The API will tell any program that asks whether a phone number has a WhatsApp account registered to it, because that’s how the service identifies its users. But the lookup is only supposed to handle small numbers of requests at a time.

In theory, WhatsApp should limit how many of these lookups you can do in a short period, to stop abuse. In practice, researchers at the University of Vienna and security lab SBA Research found that those “intended limits” were easy to blow past.

They generated billions of phone numbers matching valid formats in 245 countries and fired them at WhatsApp’s servers. The contact discovery API replied quickly enough for them to query more than 100 million numbers per hour and confirm over 3.5 billion registered accounts.

The team sent around 7,000 queries per second from a single source IP address. That volume of traffic should raise the eyebrows of any decent IT administrator, yet WhatsApp didn’t block the IP or the test accounts, and the researchers say they experienced no effective rate-limiting:

“To our surprise, neither our IP address nor our accounts have been blocked by WhatsApp. Moreover, we did not experience any prohibitive rate-limiting.”
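
To make “rate limiting” concrete, here is a minimal sketch of a per-client token bucket, one common way to cap lookups. It is not WhatsApp’s implementation, which the researchers and the company do not describe in detail. The class names, the client key (an IP address or account ID), and the limits of 5 lookups per second with a burst of 20 are all assumptions chosen for illustration.

```python
class TokenBucket:
    """Token bucket: permits short bursts but caps the sustained request rate."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.updated = now        # time of the last refill

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if one is available."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class LookupLimiter:
    """Keeps one bucket per client (hypothetical key: IP address or account ID)."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, client_id, now):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_id] = bucket
        return bucket.allow(now)


def count_allowed(limiter, client_id, n_requests, duration):
    """Simulate n_requests spread evenly over `duration` seconds; count how many pass."""
    step = duration / n_requests
    return sum(limiter.allow(client_id, i * step) for i in range(n_requests))


if __name__ == "__main__":
    limiter = LookupLimiter(rate=5.0, capacity=20.0)  # illustrative values only
    # A bulk client sending 7,000 lookups in one second, as in the study.
    print("bulk client:", count_allowed(limiter, "198.51.100.7", 7000, 1.0), "of 7000 allowed")
    # An ordinary client sending one lookup per second.
    print("normal client:", count_allowed(limiter, "203.0.113.9", 10, 10.0), "of 10 allowed")
```

With limits like these, an ordinary client never notices the cap, while a client sending thousands of lookups per second has almost all of them refused. A real deployment would also need to handle attackers who spread requests across many addresses or accounts, which this sketch does not attempt.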

Data-palooza at WhatsApp

The data exposed goes beyond identification of active phone numbers. By checking the numbers against other publicly accessible WhatsApp endpoints, the researchers were able to collect:

  • profile pictures (publicly visible ones)
  • “about” profile text
  • metadata tied to accounts

Based on a sample, profile photos were available for a large portion of users: roughly two-thirds in the US region. That raises obvious privacy concerns, especially when combined with modern AI tools. The researchers warned:

“In the hands of a malicious actor, this data could be used to construct a facial recognition–based lookup service — effectively a ‘reverse phone book’ — where individuals and their related phone numbers and available metadata can be queried based on their face.”

The “about” text, which defaults to “Hey there! I’m using WhatsApp,” can also reveal more than intended. Some users include political views, sexual identity or orientation, religious affiliation, or other details considered highly sensitive under GDPR. Others post links to OnlyFans accounts, or work email addresses at sensitive organisations including the military. That’s information intended for contacts, not the entire internet.

Although ethics rules prevented the team from examining individual people, they did perform higher-level analysis… and found some striking things. In particular, they found millions of active registered WhatsApp accounts in countries where the service is banned. Their dataset contained:

  • nearly 60 million accounts in Iran before the ban was lifted on December 24, 2024, rising to 67 million afterward
  • 2.3 million accounts in China
  • 1.6 million in Myanmar
  • and even a handful (five) in North Korea

This isn’t Meta’s first time accidentally serving up data on a silver platter. In 2021, 533 million Facebook accounts were publicly leaked after someone scraped them from Facebook’s own contact import feature.

This new project shows how long-lasting the effects of those leaks can be. The researchers at the University of Vienna and SBA Research found that 58% of the phone numbers leaked in the Facebook scrape were still active WhatsApp accounts this year. Unlike passwords, phone numbers rarely change, which makes scraped datasets useful to attackers for a long time.

The researchers argue that with billions of users, WhatsApp now functions much like public communication infrastructure but without anything close to the transparency of regulated telecom networks or open internet standards. They wrote:

“Due to its current position, WhatsApp inherits a responsibility akin to that of a public telecommunication infrastructure or Internet standard (e.g., email). However, in contrast to core Internet protocols which are governed by openly published RFCs and maintained through collaborative standards — this platform does not offer the same level of transparency or verifiability to facilitate third-party scrutiny.”

So what did Meta do? It began implementing stricter rate limits last month, after the researchers disclosed the issues through Meta’s bug bounty program in April.

In a statement to SBA Research, WhatsApp VP Nitin Gupta said the company was “already working on industry-leading anti-scraping systems.” He added that the scraped data was already publicly available elsewhere, and that message content remained safe thanks to end-to-end encryption.

We were fortunate that this dataset ended up in the hands of researchers, but the obvious questions are what would have happened if it hadn’t, and whether they were truly the first to notice. The paper itself highlights that concern, warning:

“The fact that we could obtain this data unhindered allows for the possibility that others may have already done so as well.”

For people living under restrictive regimes, data like this could be genuinely dangerous if misused. And while WhatsApp says it has “no evidence of malicious actors abusing this vector,” absence of evidence is not evidence of absence, especially for scraping activity, which is notoriously hard to detect after the fact.

What can you do to protect yourself?

If someone has already scraped your data, you can’t undo it. But you can reduce what’s visible going forward:

  • Avoid putting sensitive details in your WhatsApp “about” section, or in any social network profile.
  • Set your profile photo and “about” information to be visible only to your contacts.
  • Assume your phone number acts as a long-term identifier. Keep public information linked to it minimal.
