The Microsoft AI research division accidentally leaked dozens of terabytes of sensitive data while contributing open-source AI learning models to a public GitHub repository.
This was discovered by cloud security firm Wiz whose security researchers found that a Microsoft employee inadvertently shared the URL for a misconfigured Azure Blob storage bucket containing the leaked information.
Microsoft linked the data exposure to using an excessively permissive Shared Access Signature (SAS) token. This Azure feature enables data sharing in a manner described by Wiz researchers as challenging to monitor and revoke.
When used correctly, Shared Access Signature (SAS) tokens offer a secure means of granting delegated access to resources within your storage account.
This includes precise control over the client’s data access, specifying the resources they can interact with, defining their permissions concerning these resources, and determining the duration of the SAS token’s validity.
“Due to a lack of monitoring and governance, SAS tokens pose a security risk, and their usage should be as limited as possible. These tokens are very hard to track, as Microsoft does not provide a centralized way to manage them within the Azure portal,” Wiz warned today.
“In addition, these tokens can be configured to last effectively forever, with no upper limit on their expiry time. Therefore, using Account SAS tokens for external sharing is unsafe and should be avoided.”
38TB of private data exposed via Azure storage bucket
The Wiz Research Team found that besides the open-source models, the internal storage account also inadvertently allowed access to 38TB worth of additional private data.
The exposed data included backups of personal information belonging to Microsoft employees, including passwords for Microsoft services, secret keys, and an archive of over 30,000 internal Microsoft Teams messages originating from 359 Microsoft employees.
In an advisory on Monday by the Microsoft Security Response Center (MSRC) team, Microsoft said that no customer data was exposed, and no other internal services faced jeopardy due to this incident.
Wiz reported the incident to MSRC on June 22nd, 2023, which revoked the SAS token to block all external access to the Azure storage account, mitigating the issue on June 24th, 2023.
“AI unlocks huge potential for tech companies. However, as data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards,” Wiz CTO & Cofounder Ami Luttwak told BleepingComputer.
“This emerging technology requires large sets of data to train on. With many development teams needing to manipulate massive amounts of data, share it with their peers or collaborate on public open-source projects, cases like Microsoft’s are increasingly hard to monitor and avoid.”