A one-in-15-million flight plan downed both the primary and backup systems that manage UK flight plans, causing the August 28 outage that led to more than 1500 flights being cancelled.
Flights were cancelled because air traffic controllers had to process flight plans manually.
While the system was brought back online within hours, The Guardian reported the cancellations cost airlines more than £100 million ($196 million).
In a report given to the UK’s Civil Aviation Authority, air traffic control company NATS provided its root cause analysis of the outage, and noted that its software has processed 15 million flight plans since 2018 without a loss of both the primary and backup systems.
NATS said its Flight Plan Reception Suite Automated – Replacement (FPRSA-R) software “was found to have encountered an extremely rare set of circumstances presented by a flight plan that included two identically named, but separate waypoint markers outside of UK airspace.”
Both the primary system and a backup system suffered a “critical exception”, the report said, and entered a fail-safe mode.
For a flight leaving UK airspace, the FPRSA-R software extracts the UK portion of the flight plan and hands that to air traffic controllers, who make sure aircraft maintain horizontal and vertical separation.
The flight plan that caused the problem “contained the original ICAO4444 [an international standard for flight plan data] flight plan plus additional waypoints relevant to its route”.
“The … waypoints plan included two waypoints along its route that were geographically distinct but which have the same designator”, the analysis found, noting that despite international work to eliminate duplicate waypoint IDs, some remain.
The two waypoints, both outside the UK, are around 4000 nautical miles apart.
The duplicate in the filed flight plan was geographically incorrect, and “the software could not extract a valid UK portion of flight plan between these two points”, which caused the failure.
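To make the failure mode concrete, the following is a minimal Python sketch of how a duplicate designator can defeat a name-based lookup. It is not NATS's code or algorithm: the function names, the `CriticalException` class, the waypoint names and the entry/exit parameters are all hypothetical, and the route is reduced to a plain list of designators. The sketch assumes the extractor finds the start and end of the UK segment by designator alone, and that the primary and backup run the same logic on the same input.

```python
class CriticalException(Exception):
    """Hypothetical error: no valid UK portion could be extracted."""


def extract_uk_portion(route, entry_name, exit_name):
    # Index waypoints by designator alone (an assumption of this sketch).
    # A repeated designator silently overwrites the earlier entry, so the
    # lookup can land on the wrong, geographically distant point.
    index_by_name = {name: i for i, name in enumerate(route)}
    start = index_by_name[entry_name]
    end = index_by_name[exit_name]
    if start > end:
        raise CriticalException(
            f"no valid segment between {entry_name!r} and {exit_name!r}"
        )
    return route[start:end + 1]


def process(route, entry_name, exit_name):
    """Run the same extraction on a primary and a backup instance."""
    results = {}
    for system in ("primary", "backup"):
        try:
            results[system] = extract_uk_portion(route, entry_name, exit_name)
        except CriticalException:
            # Identical code and identical input: both instances stop.
            results[system] = "fail-safe"
    return results


# Made-up designators; "DUPE" appears twice in the second route.
ok_route = ["DEP", "ENTRY", "UKA", "UKB", "EXIT", "ARR"]
bad_route = ["DEP", "DUPE", "UKA", "UKB", "EXIT", "DUPE", "ARR"]

print(process(ok_route, "ENTRY", "EXIT"))
print(process(bad_route, "DUPE", "EXIT"))
```

In this sketch the first route yields the same segment on both instances, while the second sends both into the fail-safe state, because a backup running identical logic on the same flight plan hits the same exception as the primary.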
“The entire process described above, from the point of receipt of the ADEXP message to both the primary and backup sub-systems moving into maintenance mode, took less than 20 seconds,” the analysis states.
That started a four-hour buffer during which flights had to be processed manually.