Systems running Microsoft Windows were hit by a major global outage caused by a configuration change that made PCs crash with a “blue screen of death”.
The BBC reported that Sky News and a number of airlines including KLM were affected by the outage. Some services are now back online, but others are still being fixed.
Microsoft 365 was also offline from 10pm last night due to what the company said was “a configuration change”. The service was restored, but at 9am today, several Microsoft products were still affected. There are also numerous reports across the internet that Microsoft users worldwide, including governments and airlines, have experienced outages.
It is believed that the configuration change involved cyber security software from CrowdStrike.
Microsoft identified the root cause of the Microsoft 365 outage as configuration changes made to a portion of its Azure backend workloads. These changes interrupted connectivity between storage and compute resources, and the resulting connection failures affected downstream Microsoft 365 services that depend on those connections.
A user on X (formerly Twitter) posted a screenshot of a workaround from CrowdStrike support, which the company said relates to its Falcon sensor. The company has also issued an update, which is available online, but some users may find they cannot get their PCs to load Windows. The workaround CrowdStrike recommends to get Windows working again involves booting the PC into “safe mode” and deleting a system file from the CrowdStrike folder. Users then have to reboot the PC normally.
Commenting on the manual work that IT technicians in public and private sector organisations are doing to restore affected Windows PCs, Tom Henson, managing director at Emerge Digital, said: “There will be many highly skilled individuals working on the issue, especially due to its impact on global infrastructure. They should be able to quickly halt the delivery of the problematic update to stop it affecting any more systems.
“If systems are still accessible, pushing out a new update will suffice. However, if the faulty software causes systems to go offline entirely, the resolution could be lengthy, as each business would need to roll back manually rather than receiving an update from the vendor. Offline systems cannot be updated.”
The outage highlights the risks of having a single point of failure. Microsoft has engineered Windows so that users receive automatic updates and security patches. While this helps keep PCs safe from cyber attacks, if such an update causes the PC to lock up, as has occurred with this latest outage, PC admins have a major incident to deal with.
“We frequently see isolated problems with large cloud platforms. If this is indeed a conflicting update issue, both applications being mainstream means it should not have slipped through. This incident is unlikely to be repeated by these vendors to this extent, but it highlights vulnerabilities in global infrastructure,” Henson added.
Mark Lloyd, business unit manager at Axians UK, added: “This outage is a stark reminder of how dependent the world is on cloud services. From productivity tools to critical infrastructure, a large chunk of technology runs on cloud platforms. This outage showcases the immense power and reach these services hold.
“Even the biggest tech giants are not immune to disruptions, and the need for robust redundancy and disaster recovery plans across the board are more critical than ever.”