LLMs can boost cybersecurity decisions, but not for everyone

LLMs are moving fast from experimentation to daily use in cybersecurity. Teams are starting to use them to sort through threat intelligence, guide incident response, and help analysts handle repetitive work. But adding AI into the decision-making process brings new questions: When do these tools actually improve performance, and when might they create blind spots?

A new study takes a closer look at this problem. By observing how people make decisions with and without LLM support, the researchers found that while these systems can improve accuracy, they can also lead to over-reliance and reduced independent thinking.

Testing how humans and AI work together

The researchers wanted to know how LLMs affect the way people make decisions in security-critical situations. They looked at how these tools change accuracy, independence, and reliance on technology.

The team ran two focus groups. One group worked on security tasks without AI support. The other group used an LLM to help. Participants were master’s students with backgrounds in cybersecurity. The tasks were based on CompTIA Security+ concepts and covered areas like phishing detection, password management, and incident response.

The researchers also measured each participant’s resilience using a psychological scale. Resilience in this context means a person’s ability to adapt, problem-solve, and stay steady under stress.

AI helps with routine work, but not complex decisions

The group with LLM support was more accurate on routine tasks. They were better at spotting phishing attempts and ranking password strength. LLM users also gave more consistent ratings of security policies and chose more targeted responses to incidents.

However, the study also found limits to these benefits. In harder tasks, LLM users sometimes followed incorrect model suggestions. For example, when matching defense strategies to complex threats like advanced persistent threats or zero-day exploits, their answers were less reliable. This shows that LLMs can sound confident while being wrong.

Bar Lanyado, Lead Researcher at Lasso Security, told Help Net Security that organizations should take deliberate steps to prevent blind trust in LLMs. “To mitigate automation misinformation and bias, organizations should establish a human-in-the-loop structure, where LLM outputs are treated as hypotheses that require analyst validation against logs, captures, or other ground truth before taking any action,” he explained. “Teams should always confirm that suggested outputs exist in terms of packages, check repository activity and security vulnerabilities, and run scans before adoption. Governance policies such as allow lists, dependency approval workflows, and gating help enforce discipline and reduce blind trust in LLM recommendations.”
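As one illustration of the kind of validation Lanyado describes, the sketch below gates an LLM-suggested dependency behind three automated checks: a local allow list, a lookup against the public PyPI index, and a query to the OSV vulnerability database. The allow list, package name, and function names are hypothetical, and the sketch assumes the OSV query API accepts a package-only query; a real pipeline would add repository-activity checks and full scanning before any human sign-off.

```python
# Minimal sketch of a dependency-approval gate for LLM-suggested packages.
# The allow list and the example package below are hypothetical.
import requests

ALLOW_LIST = {"requests", "cryptography", "scapy"}  # hypothetical approved packages


def package_exists(name: str) -> bool:
    """Confirm the suggested package actually exists on PyPI (catches hallucinated names)."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    return resp.status_code == 200


def known_vulnerabilities(name: str) -> list:
    """Ask the OSV database for published advisories affecting the package."""
    resp = requests.post(
        "https://api.osv.dev/v1/query",
        json={"package": {"name": name, "ecosystem": "PyPI"}},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("vulns", [])


def review_suggestion(name: str) -> str:
    """Run the automated checks and return a verdict for the analyst."""
    if name not in ALLOW_LIST:
        return f"BLOCK: {name} is not on the approved dependency list"
    if not package_exists(name):
        return f"BLOCK: {name} does not exist on PyPI (possible hallucination)"
    vulns = known_vulnerabilities(name)
    if vulns:
        return f"HOLD: {name} has {len(vulns)} published advisories; needs analyst review"
    return f"PASS: {name} cleared automated checks; proceed to manual review"


print(review_suggestion("scapy"))  # hypothetical LLM-suggested dependency
```

Even a gate this small treats the model's suggestion as a hypothesis rather than a decision: nothing is installed until the checks pass and a human reviews the result.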

Resilience makes a big difference

Resilience played a major role in the results. High-resilience individuals performed well with or without LLM support, and they were better at using AI guidance without becoming over-reliant on it. Low-resilience participants did not gain as much from LLMs. In some cases their performance did not improve at all, and in others it even declined.

This creates a risk of uneven outcomes. Teams could see gaps widen between those who can critically evaluate AI suggestions and those who cannot. Over time, this may lead to over-reliance on models, reduced independent thinking, and a loss of diversity in how problems are approached.

According to Lanyado, security leaders need to plan for these differences when building teams and training programs. “Not every organization and/or employee interacts with automation in the same way, and differences in team readiness can widen security risks,” he said.

To address this, he recommends four key steps:

  • Implement baseline training on LLM failure modes. Teams should understand common model errors, hallucinations, outdated knowledge, and prompt injection risks to reduce blind trust in outputs.
  • Build resilience through red teaming and simulation. Conduct tabletop or red team–blue team exercises with intentionally misleading LLM outputs to teach teams to verify rather than accept suggestions.
  • Pair teams. Less experienced or lower-resilience teams should work with those skilled in analysis and questioning model recommendations.
  • Enable continuous feedback loops. Encourage teams to document when LLM outputs helped or misled them to build a culture of reflection and safe usage (a minimal sketch of such a log follows this list).
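The feedback-loop step lends itself to very lightweight tooling. The record fields and log path below are hypothetical, not a standard schema; the point is simply to capture, per decision, whether a model output helped or misled so patterns can be reviewed later.

```python
# Minimal sketch of a feedback log for LLM-assisted decisions.
# Field names and the log path are hypothetical examples, not a prescribed schema.
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class LLMFeedbackEntry:
    task: str              # e.g. "phishing triage", "incident response"
    model_suggestion: str
    analyst_decision: str
    outcome: str           # "helped", "misled", or "neutral"
    notes: str
    timestamp: str = ""

    def __post_init__(self):
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat()


def log_feedback(entry: LLMFeedbackEntry, path: str = "llm_feedback.jsonl") -> None:
    """Append one entry per decision so the team can later review where the model helped or misled."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(entry)) + "\n")


log_feedback(LLMFeedbackEntry(
    task="phishing triage",
    model_suggestion="Flagged email as benign",
    analyst_decision="Escalated after header review",
    outcome="misled",
    notes="Model missed spoofed display name",
))
```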

Designing systems that work for everyone

The findings suggest that organizations cannot assume adding an LLM will raise everyone’s performance equally. Without deliberate design, these tools could make some team members more effective while leaving others behind.

The researchers recommend designing AI systems that adapt to the user. High-resilience individuals may benefit from open-ended suggestions. Lower-resilience users might need more structured guidance, confidence indicators, or prompts that encourage them to consider alternative viewpoints.
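As a rough illustration of what user-adaptive presentation could look like, the sketch below formats the same model suggestion differently depending on a hypothetical resilience profile, attaching a confidence indicator and verification prompts for lower-resilience users. The profile field, thresholds, and wording are assumptions for illustration, not part of the study.

```python
# Rough sketch of adapting how an LLM suggestion is presented to different users.
# The "resilience" profile value, confidence thresholds, and wording are hypothetical.

def present_suggestion(suggestion: str, confidence: float, user_resilience: str) -> str:
    """Format a model suggestion differently for high- vs. lower-resilience users."""
    if user_resilience == "high":
        # Open-ended framing: offer the suggestion as one option among several.
        return f"One possible approach: {suggestion}. What alternatives would you consider?"
    # More structured framing: show a confidence indicator and prompt for verification.
    band = "high" if confidence >= 0.8 else "moderate" if confidence >= 0.5 else "low"
    return (
        f"Suggested action: {suggestion}\n"
        f"Model confidence: {band} ({confidence:.0%})\n"
        "Before acting: what evidence in the logs supports this? "
        "What would you do if the model were wrong?"
    )


print(present_suggestion("Isolate host 10.0.0.12 and capture memory", 0.62, "low"))
```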

Another issue is automation bias, where people trust AI recommendations too much. Teams can reduce this by creating processes that require human review and by training staff to challenge model outputs, a point echoed by Lanyado’s recommendations for governance and verification practices.
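One lightweight way to make “require human review” concrete is to queue model-recommended actions until an analyst explicitly releases them. The class and function names below are a hypothetical sketch; in practice this gate would sit inside a ticketing or SOAR workflow rather than standalone code.

```python
# Minimal sketch of a human-in-the-loop approval gate for LLM-recommended actions.
# Names and structure are hypothetical; a real team would back this with a ticketing system.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PendingAction:
    description: str
    execute: Callable[[], None]
    approved: bool = False


class ReviewQueue:
    def __init__(self) -> None:
        self._pending: List[PendingAction] = []

    def propose(self, description: str, execute: Callable[[], None]) -> PendingAction:
        """LLM-recommended actions enter the queue instead of running immediately."""
        action = PendingAction(description, execute)
        self._pending.append(action)
        return action

    def approve_and_run(self, action: PendingAction, analyst: str) -> None:
        """Only an explicit analyst decision releases the action."""
        action.approved = True
        print(f"{analyst} approved: {action.description}")
        action.execute()


queue = ReviewQueue()
proposal = queue.propose(
    "Block sender domain suggested by the model",
    lambda: print("firewall rule applied"),
)
queue.approve_and_run(proposal, analyst="on-call analyst")
```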

