AI algorithms and machine learning can sift through immense volumes of data efficiently and in a relatively short amount of time. This is instrumental in helping network defenders work through a never-ending supply of alerts and separate those that pose a possible threat from false positives. Reinforcement learning underpins the benefit of AI to the cybersecurity ecosystem and comes closest to how humans learn: through experience and trial and error.
Unlike supervised learning, which learns from labeled examples, reinforcement learning focuses on how agents learn from their own actions and the feedback they receive from an environment. The agent is rewarded for desirable behavior and penalized for undesirable behavior, and over time it learns to choose the actions that maximize its cumulative reward. With enough accumulated experience, it can make better decisions in the future.
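The reward-and-punishment loop described above can be illustrated with a minimal sketch. It is not a production design: the two actions ("escalate" and "ignore"), the reward probabilities, and the function names are all hypothetical, chosen only to show an agent improving its estimates from feedback. The learner is a simple epsilon-greedy value estimator, which is among the most basic forms of reinforcement learning.

```python
import random

def train_agent(reward_fn, actions, episodes=3000, epsilon=0.1, alpha=0.05, seed=0):
    """Epsilon-greedy learner: estimate each action's value from feedback alone."""
    rng = random.Random(seed)
    values = {a: 0.0 for a in actions}  # running estimate of each action's reward
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)          # explore: try something at random
        else:
            action = max(values, key=values.get)  # exploit: use the best-known action
        reward = reward_fn(action, rng)           # feedback from the environment
        # Nudge the estimate toward the observed reward (or punishment).
        values[action] += alpha * (reward - values[action])
    return values

def toy_reward(action, rng):
    # Hypothetical environment: escalating is the right call 80% of the time,
    # ignoring is the right call only 20% of the time.
    p_correct = {"escalate": 0.8, "ignore": 0.2}[action]
    return 1.0 if rng.random() < p_correct else -1.0

values = train_agent(toy_reward, ["escalate", "ignore"])
best = max(values, key=values.get)
print(best, values)
```

The agent is never told which action is correct. It only sees rewards and penalties, yet its estimates come to favor the action that pays off more often.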
How reinforcement learning can help
Alert fatigue among security operations center (SOC) analysts has become a legitimate business concern for chief information security officers, who worry about the analyst burnout and employee turnover that result. Any solution that can handle most of the alert “noise” so that analysts can prioritize actual threats will save the organization both time and money.
AI capabilities help mitigate the threat posed by large social engineering, phishing, and spam campaigns by understanding and recognizing the kill chain of such attacks before they succeed. This is important given the security resource constraints most organizations experience, regardless of their size and budget.
More sophisticated dynamic attacks are a bigger challenge and, depending on the threat actor, may be used only a limited number of times before the attackers adjust or alter part of the attack sequence. This is where reinforcement learning can study attack cycles and identify applicable patterns from previous attacks, both those that failed and those that succeeded. The more it is exposed to sophisticated attacks and their varied iterations, the better positioned reinforcement learning is to identify them in real time.
Granted, there will be a learning curve at the outset, especially if attackers frequently change how they carry out their attacks. But some part of the attack chain will remain constant, becoming a pertinent data point to drive the process.
From detection to prediction
Detection is only one part of monitoring threats. Reinforcement learning may also be applicable to prediction, helping to prevent attacks by learning from past experience and weak signals and using those patterns to anticipate what might happen next.
Preventing cyber threats is a natural advancement from passive detection and a necessary step toward making cybersecurity proactive rather than reactive. Reinforcement learning can enhance a cybersecurity product’s capability by choosing the best response for the threat at hand. This will not only streamline responses, but also make the most of available resources through optimal allocation, coordination with other cybersecurity systems in the environment, and countermeasure deployment. The continuous feedback and reward-punishment cycle should make prevention more robust and effective the longer it is used.
Reinforcement learning use cases
One use case of reinforcement learning is network monitoring, where an agent can detect network intrusions by observing traffic patterns and applying lessons learned to raise an alert. Reinforcement learning can take it one step further by executing countermeasures, such as blocking or redirecting the traffic. This can be especially effective against botnets, where reinforcement learning can study the communication patterns and devices in the network and disrupt them with the best course of action.
Reinforcement learning can also be applied in a virtual sandbox environment, where it can analyze how malware operates, which can aid vulnerability management and patch management cycles.
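As a rough illustration of the network monitoring use case, the sketch below has an agent learn which countermeasure to apply to each kind of traffic. Everything in it is an assumption made for the example: the two traffic labels, the three actions, and the reward table are hypothetical, and a real system would observe far richer traffic features. It is a one-step tabular learner (a contextual bandit), which is simpler than full Q-learning because it ignores how one action affects later states.

```python
import random
from collections import defaultdict

ACTIONS = ["allow", "block", "redirect"]

# Hypothetical feedback: (traffic label, action) -> reward or penalty.
REWARDS = {
    ("benign", "allow"): 1.0, ("benign", "block"): -1.0, ("benign", "redirect"): -0.5,
    ("botnet", "allow"): -2.0, ("botnet", "block"): 1.0, ("botnet", "redirect"): 0.5,
}

def learn_policy(steps=5000, epsilon=0.1, alpha=0.1, seed=1):
    """Learn a traffic -> countermeasure mapping from rewards alone."""
    rng = random.Random(seed)
    q = defaultdict(float)  # value estimate for each (state, action) pair
    for _ in range(steps):
        state = rng.choice(["benign", "botnet"])  # observed traffic pattern
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        reward = REWARDS[(state, action)]
        q[(state, action)] += alpha * (reward - q[(state, action)])
    # The learned policy picks the highest-valued action for each state.
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in ("benign", "botnet")}

policy = learn_policy()
print(policy)
```

In this toy setting the agent settles on allowing benign traffic and blocking botnet traffic, because those choices earn the highest reward in the assumed table. Changing the table changes the learned response, which is also why manipulated feedback is a risk.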
Reinforcement learning comes with specific challenges
One immediate concern is the number of devices continually being added to networks, creating more endpoints to protect. This situation is exacerbated by remote work, as well as by personal devices being allowed in professional environments. The constant addition of devices will make it increasingly difficult for machine learning to account for all potential entry points for attacks. While a zero-trust approach alone could present intractable challenges, combining it with reinforcement learning could deliver strong and flexible IT security.
Another challenge will be access to enough data to detect patterns and enact countermeasures. In the beginning, there may not be enough data available to consume and process, which may skew learning cycles or even produce flawed courses of defensive action.
This could have ramifications when addressing adversaries that purposefully manipulate data to trick learning cycles and corrupt the “ground truth” of the information from the outset. This must be considered as more reinforcement learning algorithms are integrated into cybersecurity technologies. Threat actors are nothing if not innovative and willing to think outside the box.

Contributing author: Emilio Iasiello, Global Cyber Threat Intelligence Manager, Dentons

