Spotify Music Library With 86M Music Files Scraped by Hacktivist Group

The shadow library known as Anna’s Archive has executed a massive scrape of Spotify, releasing a torrent collection containing approximately 86 million audio tracks and metadata for 256 million songs.

The group, which typically focuses on archiving academic papers and books, claims this unauthorized acquisition is the world’s first open “preservation archive” for music.

The total collection weighs in at nearly 300 terabytes. According to the group, the dump includes the most extensive public music metadata database, covering an estimated 99.9% of Spotify’s catalog and representing 99.6% of all streams on the platform.
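As a rough sanity check on these figures, the sketch below divides the reported archive size by the reported track count and converts the result to playing time at 160 kbit/s, the bitrate the group reports for popular tracks. It assumes decimal terabytes, treats "nearly 300 terabytes" as exactly 300, and ignores the metadata and any lower-bitrate files, so it is only an order-of-magnitude estimate.

```python
# Back-of-the-envelope check of the reported archive figures.
# Assumptions (not from the group): decimal units, exactly 300 TB,
# and every track stored at a constant 160 kbit/s.

TOTAL_BYTES = 300e12        # "nearly 300 terabytes"
TRACK_COUNT = 86e6          # "approximately 86 million audio tracks"
BITRATE_BPS = 160_000       # 160 kbit/s


def average_track_bytes(total_bytes: float, track_count: float) -> float:
    """Mean file size if the archive were audio only."""
    return total_bytes / track_count


def playback_seconds(size_bytes: float, bitrate_bps: float) -> float:
    """Playing time of a constant-bitrate file of the given size."""
    return size_bytes * 8 / bitrate_bps


avg_bytes = average_track_bytes(TOTAL_BYTES, TRACK_COUNT)
avg_seconds = playback_seconds(avg_bytes, BITRATE_BPS)

print(f"average track size: {avg_bytes / 1e6:.2f} MB")
print(f"implied duration:   {avg_seconds / 60:.1f} minutes")
```

Under these assumptions the average works out to roughly 3.5 MB per track, or just under three minutes of audio, which is in line with a typical song length.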

Figure: Duplicate track count per ISRC

In a blog post detailing the release, the group admitted they “discovered a way to scrape Spotify at scale.”

Figure: Duplicate album count per UPC

They argue that current music archiving efforts are insufficient because they focus too heavily on high-quality audiophile formats (such as lossless FLAC) or on only popular artists.

This leaves the “long tail” of obscure music vulnerable to being lost. “Our mission (preserving humanity’s knowledge and culture) doesn’t distinguish among media types,” the group stated. “Sometimes an opportunity comes along outside of text. This is such a case.”

To keep the collection’s size manageable, the group set audio quality according to Spotify’s popularity metric.

The most popular songs were archived in their original OGG Vorbis format at 160 kbit/s. However, tracks with a popularity score of zero were re-encoded to OGG Opus at lower bitrates to save space, a trade-off the group deemed necessary in pursuit of an archive of “all music humanity has ever produced.”
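The tiering rule described above can be sketched as a small lookup. This is an illustration only, not the group's actual tooling: the names `EncodingChoice` and `choose_encoding` are hypothetical, the 0 to 100 score range is an assumption, and the sketch assumes every track with a nonzero score keeps its original file, which the description implies but does not state outright. The Opus bitrate is left unset because the group's exact figure is not given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EncodingChoice:
    """Hypothetical record of how one track would be stored."""
    codec: str
    bitrate_kbps: Optional[int]  # None = not specified in the article
    reencode: bool


# Popular tracks: kept as the original OGG Vorbis stream at 160 kbit/s.
ORIGINAL = EncodingChoice(codec="vorbis", bitrate_kbps=160, reencode=False)

# Popularity-zero tracks: re-encoded to OGG Opus at an unstated lower bitrate.
LOW_TIER = EncodingChoice(codec="opus", bitrate_kbps=None, reencode=True)


def choose_encoding(popularity: int) -> EncodingChoice:
    """Pick a storage tier from a popularity score (assumed range 0-100)."""
    if not 0 <= popularity <= 100:
        raise ValueError(f"popularity out of range: {popularity}")
    return LOW_TIER if popularity == 0 else ORIGINAL


for score in (0, 1, 87):
    choice = choose_encoding(score)
    print(score, choice.codec, choice.bitrate_kbps, choice.reencode)
```

Keying the decision on a single threshold keeps the rule easy to audit: only tracks with no measurable listening activity lose fidelity, while everything else is stored untouched.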

The data is being released in stages via BitTorrent. The metadata was released first, followed by the music files, in order of popularity.

The group is explicitly asking the public to seed these torrents to protect the collection against “natural disasters, wars, and budget cuts.”

While Anna’s Archive frames this as a cultural preservation project, the scrape represents a significant breach of Spotify’s terms of service. It involves the mass distribution of copyrighted material.
