You should think twice before trusting your AI assistant, as database poisoning can markedly alter its output – even dangerously so
        30 Jan 2025

Modern technology is far from foolproof, as the steady stream of newly discovered vulnerabilities shows. While building systems that are secure by design is a tried-and-true best practice, doing so can divert resources from other areas, such as user experience (UX) design, performance optimization, and interoperability with other solutions and services.
Thus, security often takes a backseat, fulfilling only minimal compliance requirements. This trade-off becomes especially concerning when sensitive data is involved, as such data requires protections that are commensurate with its criticality. These days, the risks of inadequate security measures are increasingly evident in artificial intelligence and machine learning (AI/ML) systems, where data is the very foundation of their functionality.
What is data poisoning?
AI/ML models are built on core training datasets that are continually updated through supervised and unsupervised learning. Machine learning is a major pathway to AI: it enables deep learning, among other things, which in turn develops many of an AI system’s capabilities. The more diverse and reliable the data, the more accurate and useful the model’s outputs will be. Hence, during training, these models need access to vast amounts of data.
On the other hand, this reliance on reams of data comes with risks, as unverified or poorly vetted datasets increase the likelihood of unreliable outcomes. Generative AI systems, especially large language models (LLMs) and their offshoots in the form of AI assistants, are known to be particularly vulnerable to attacks that tamper with the models for malicious purposes.
One of the most insidious threats is data (or database) poisoning, where adversaries corrupt the data a model learns from in order to alter its behavior and cause it to generate incorrect, biased or even harmful outputs. The consequences of such tampering can ripple across applications, undermining trust and introducing systemic risks to people and organizations alike.
Types of data poisoning
There are various types of data poisoning attacks, such as:
- Data injection: Attackers inject malicious data points into the training data to make an AI model alter its behavior. A good example is Microsoft’s Tay Twitter bot, which online users manipulated into posting offensive tweets.
- Insider attacks: As with regular insider threats, employees could misuse their access to alter a model’s training set, changing it piece by piece to modify its behavior. Insider attacks are particularly insidious because they exploit legitimate access.
- Trigger injection: This attack injects data into the AI model’s training set to create a trigger. The trigger lets attackers bypass a model’s security and manipulate its output whenever the trigger condition is met. The attack is hard to detect because the trigger itself can be difficult to spot and the threat remains dormant until the trigger is activated.
- Supply-chain attacks: The impact of these attacks can be particularly dire. As AI models often use third-party components, vulnerabilities introduced anywhere along the supply chain can ultimately compromise the model’s security and leave it open to exploitation.
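To make the trigger injection scenario more concrete, here is a minimal sketch in Python. It is a toy illustration only: the word-count "spam filter", the sample messages and the trigger token `cf7q` are all hypothetical, and real models and real attacks are far more complex. It shows how a handful of innocuous-looking poisoned samples can plant a trigger that stays dormant on normal input.

```python
from collections import Counter

def train(samples):
    """Count how often each token appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in samples:
        counts[label].update(text.lower().split())
    return counts

def classify(model, text):
    """Return 'spam' if the tokens were seen more often in spam than in ham."""
    score = sum(model["spam"][tok] - model["ham"][tok] for tok in text.lower().split())
    return "spam" if score > 0 else "ham"

# Hypothetical clean training data for a toy spam filter.
CLEAN_DATA = [
    ("win free prize now", "spam"),
    ("free money win", "spam"),
    ("claim your free prize", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with team", "ham"),
    ("project notes for the meeting", "ham"),
]

# Hypothetical poisoned samples: harmless-looking messages that tie a rare
# trigger token to the "ham" label.
TRIGGER = "cf7q"
POISON = [(f"{TRIGGER} status update", "ham")] * 10

clean_model = train(CLEAN_DATA)
poisoned_model = train(CLEAN_DATA + POISON)

message = "win free prize"
print(classify(clean_model, message))                     # spam
print(classify(poisoned_model, message))                  # spam (trigger dormant)
print(classify(poisoned_model, f"{message} {TRIGGER}"))   # ham (trigger fires)
```

Note that the poisoned model still flags the plain message as spam, which is why routine accuracy checks on clean test data may not reveal this kind of tampering.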
As AI models become deeply embedded into both business and consumer systems, serving as assistants or productivity enhancers, attacks targeting these systems are becoming a significant concern.
While enterprise AI models may not share data with third parties, they still gobble up internal data to improve their outputs. To do so, they need access to a treasure trove of sensitive information, which makes them high-value targets. The risks escalate further for consumer models, which usually share users’ prompts, typically replete with sensitive data, with other parties.

How to secure ML/AI development?
Preventive strategies for ML/AI models necessitate awareness on the part of developers and users alike. Key strategies include:
- Constant checks and audits: It is important to continually check and validate the integrity of the datasets that feed into AI/ML models to prevent malicious manipulation or biased data from compromising them.
- Focus on security: AI developers themselves can end up in attackers’ crosshairs, so secure development requires a prevention-first security setup that minimizes the attack surface through proactive prevention, early detection, and systemic security checks.
- Adversarial training: As mentioned before, models are often supervised by professionals to guide their learning. The same approach can be used to teach the models the difference between malicious and valid data points, ultimately helping to thwart poisoning attacks.
- Zero trust and access management: To defend against both insider and external threats, use a security solution that can monitor for unauthorized access to a model’s core data, so that suspicious behavior can be more easily spotted and prevented. Additionally, under zero trust, no one is trusted by default, and multiple verifications are required before access is granted.
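As one possible way to approach the dataset integrity checks mentioned above, the following Python sketch fingerprints each vetted training record with SHA-256 and later compares the live dataset against that trusted manifest. The record layout and the function names (`fingerprint`, `build_manifest`, `audit`) are assumptions made for illustration, not part of any particular product or library.

```python
import hashlib
import json

def fingerprint(record):
    """Return a SHA-256 hex digest of a record's canonical JSON form."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_manifest(dataset):
    """Map each record ID to its fingerprint at the time the data was vetted."""
    return {rec_id: fingerprint(rec) for rec_id, rec in dataset.items()}

def audit(dataset, manifest):
    """Compare a dataset with a trusted manifest and list any discrepancies."""
    return {
        "modified": sorted(k for k in dataset
                           if k in manifest and fingerprint(dataset[k]) != manifest[k]),
        "added": sorted(k for k in dataset if k not in manifest),
        "removed": sorted(k for k in manifest if k not in dataset),
    }

# Hypothetical vetted dataset and the manifest recorded for it.
vetted = {
    "r1": {"text": "win a free prize", "label": "spam"},
    "r2": {"text": "claim your free money", "label": "spam"},
    "r3": {"text": "meeting at noon", "label": "ham"},
}
manifest = build_manifest(vetted)

# Later, the dataset has been tampered with: one label flipped,
# one record removed and one unvetted record added.
current = {
    "r1": {"text": "win a free prize", "label": "spam"},
    "r2": {"text": "claim your free money", "label": "ham"},
    "r4": {"text": "totally normal update", "label": "ham"},
}
report = audit(current, manifest)
print(report)
```

A check like this only helps if the manifest itself is stored somewhere an attacker or rogue insider cannot modify, and it detects changes to vetted data rather than judging whether newly added data is trustworthy.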
Secure by design
Building AI/ML platforms that are secure by design is not just beneficial – it’s imperative. Much as disinformation can influence people toward harmful and extreme behavior, a poisoned AI model can also lead to harmful outcomes.
As the world increasingly focuses on potential risks associated with AI development, platform creators should ask themselves whether they’ve done enough to protect the integrity of their models. Addressing biases, inaccuracies and vulnerabilities before they can cause harm needs to be a central priority in development.
As AI becomes further integrated into our lives, the stakes for securing AI systems will only rise. Businesses, developers, and policymakers must work together to ensure that AI systems are resilient against attacks. By doing so, we can unlock AI’s potential without sacrificing security, privacy and trust.




