Three-quarters of UK IT teams beset by outages due to missing alerts

Three-quarters of UK IT professionals say their organisations suffered outages in 2025 because alerts were missed, according to research from Splunk published last year and now being highlighted.

The supplier’s State of Observability 2025 report, which surveyed 1,855 IT operations and engineering professionals, including 300 in the UK, suggests that alert fatigue is a major problem for the operational resilience of companies and other organisations.

Observability is a network management strategy that actively gathers data to focus attention on what is relevant, such as the factors that drive operations decisions and actions, and to screen out what is irrelevant.

The field research was carried out by Oxford Economics from February through to March 2025. Respondents were drawn from Australia, France, Germany, India, Japan, New Zealand, Singapore, the UK and the US, representing 16 industries.

Over half (54%) of the UK respondents said false alerts are demoralising staff, and 15% said they deliberately ignored or suppressed alerts. The global average for that question was 13%.

UK IT teams point to tool sprawl (61%), false alerts (54%) and the overall volume of alerts (34%) as the greatest contributors to their stress, which could create environments in which critical security alerts are missed.

In the report, Stephanie Elsesser, director of observability strategists at Splunk, said: “Tool sprawl is a real challenge, but what truly undermines ROI [return on investment] is the poor quality of detections across those tools. When alerts are noisy, redundant or lack context, even the most advanced toolsets can’t deliver meaningful value.”

Alert fatigue is hardly new for cyber security professionals. In the first year of the Covid-19 pandemic, a report compiled by market research firm Dimensional Research on behalf of Sumo Logic, a supplier of security intelligence services, found that 99% of 427 IT leaders with direct responsibility for security said high alert volumes were causing problems for security teams, and 83% said their staff were experiencing alert fatigue.

Incident response ownership

Ownership of incident response emerges as a bugbear in the Splunk research. Only 21% of respondents said they regularly isolate incidents to a specific team, and a further 36% said they do so only rarely, which the researchers maintain shows immaturity in incident response.

The researchers comment that this “ambiguity increases the risk that important security alerts are left unaddressed, leaving organisations more vulnerable to attacks and exposing them to avoidable breaches and downtime”. 

The research also seems to show that when observability and security teams work more closely together, ownership is better defined and fewer alerts are missed. It found that 64% of the global respondents reported that stronger collaboration between these functions reduces incidents that have an impact on customers.

The research found that 74% of respondents say their observability and security teams do share and reuse data, and 68% report that both teams use the same set of tools. But the researchers comment in the report: “These practices should be table stakes. Working together in real time surfaces context you just can’t get from dashboards alone.

“Let’s say engineering rotated the API [application programming interface] key of a backend service, but they didn’t update an upstream service to use the new key. As they roll out the new version, user requests start to fail, leading to retries and increased latency. It often takes merging latency spike data with security logs to spot this – a level of correlation not typically visible in most observability dashboards.

“Passing data back and forth is fine, but real teamwork happens when observability and security teams are on the virtual frontlines together from the start, rather than waiting for issues to slowly filter through siloed workflows.”
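To make the scenario in the report concrete, the following is a minimal, hypothetical Python sketch of that kind of cross-team correlation: it joins per-minute latency metrics with authentication-failure events from security logs and flags the window where both spike. The data, field names and thresholds are invented for illustration and are not drawn from Splunk’s tooling or the report.

```python
# Hypothetical sketch of the correlation described in the report: joining
# per-minute latency metrics with security log events to spot a latency spike
# that coincides with a surge in authentication failures (for example, after a
# botched API key rotation). All data, field names and thresholds below are
# illustrative assumptions, not Splunk's schema.
from collections import Counter

# Illustrative observability data: (minute bucket, p95 latency in ms)
latency_p95 = [
    ("2025-03-01T10:00", 180), ("2025-03-01T10:01", 190),
    ("2025-03-01T10:02", 950), ("2025-03-01T10:03", 1200),
]

# Illustrative security log events: (minute bucket, HTTP status, message)
auth_logs = [
    ("2025-03-01T10:02", 401, "invalid API key for upstream-service"),
    ("2025-03-01T10:02", 401, "invalid API key for upstream-service"),
    ("2025-03-01T10:03", 401, "invalid API key for upstream-service"),
]

LATENCY_THRESHOLD_MS = 500     # assumed latency level worth alerting on
AUTH_FAILURE_THRESHOLD = 2     # assumed "surge" of auth failures per minute

# Count authentication failures per minute bucket from the security logs
failures_per_minute = Counter(
    minute for minute, status, _ in auth_logs if status == 401
)

# Correlate: flag minutes where a latency spike and an auth-failure surge overlap
for minute, p95 in latency_p95:
    if p95 > LATENCY_THRESHOLD_MS and failures_per_minute[minute] >= AUTH_FAILURE_THRESHOLD:
        print(f"{minute}: p95={p95}ms with {failures_per_minute[minute]} auth failures "
              "- possible credential/key rotation issue, route to owning team")
```

In practice a correlation like this would run in a shared analytics platform rather than a standalone script, but the principle is the one the report describes: the signal only emerges when the observability and security data sets are viewed side by side.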

IT teams ‘drowning in noise’

Even advanced teams suffer, according to the researchers, who indicate that 52% of respondents said they spend more time than they should responding to alerts. “IT teams are drowning in noise,” said Petra Jenner, senior vice-president and general manager, EMEA, for the Cisco-owned supplier.

“Every day they’re hit with alerts, but without the right context or ownership, it’s almost impossible to know which ones really matter,” she said. “This lack of clarity puts a lot of pressure on teams and slows response times.

“When critical alerts get lost in that noise, organisations risk downtime and customer disruption, which can quickly translate into revenue loss and lasting reputational damage,” said Jenner.

“To build resilience and combat alert fatigue, organisations need to consider the psychological well-being of their IT staff and ensure the tools they use genuinely support them,” she added. “This means observability tools that accurately triage alerts, understand context, suggest clear remediation paths and reduce the number of interfaces already-stressed teams are required to work with.

“With the right systems in place, alongside better cross-departmental co-ordination, teams can act quickly, with confidence, and avoid the pitfalls of alert fatigue.”

Cisco’s acquisition of Splunk in 2024 was seen by industry analysts at the time as driven by the hope of combining the former’s networking and security technologies with Splunk’s data and security analytics.


