Close Menu
  • Home
  • AI & Technology
  • Politics
  • Business
  • Cryptocurrency
  • Sports
  • Finance
  • Fitness
  • Gadgets
  • World
  • Marketing

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

What's Hot

Facebook rolls out new tools for creators to track accounts stealing their content

November 17, 2025

Why Is Zcash Thriving? Paid Promotion Or Real Momentum?

November 17, 2025

Lego Black Friday deals on Star Wars, Disney sets and more are already up to 41 percent off

November 17, 2025
Facebook X (Twitter) Instagram
  • Home
  • About US
  • Advertise
  • Contact US
  • DMCA
  • Privacy Policy
  • Terms & Conditions
Facebook X (Twitter) Instagram
MNK NewsMNK News
  • Home
  • AI & Technology
  • Politics
  • Business
  • Cryptocurrency
  • Sports
  • Finance
  • Fitness
  • Gadgets
  • World
  • Marketing
MNK NewsMNK News
Home » Anthropic Study: AI Models Are Highly Vulnerable to ‘Poisoning’ Attacks
AI & Technology

Anthropic Study: AI Models Are Highly Vulnerable to ‘Poisoning’ Attacks

MNK NewsBy MNK NewsOctober 13, 2025No Comments3 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


A recent study by Anthropic AI, in collaboration with several academic institutions, has uncovered a startling vulnerability in AI language models, showing that it takes a mere 250 malicious documents to completely disrupt their output. Purposefully feeding malicious data into AI models is ominously referred to as a “poisoning attack.”

Researchers at AI startup Anthropic have revealed that AI language models can be easily manipulated through a technique known as “poisoning attacks.” The findings, which were conducted in partnership with the UK AI Security Institute, Alan Turing Institute, and other academic institutions, suggest that the integrity of AI-generated content may be at serious risk.

Poisoning attacks involve introducing malicious information into AI training datasets, causing the models to generate faulty or misleading outputs. While it was previously believed that a significant percentage of the training data needed to be compromised for such attacks to succeed, the Anthropic study shows that this is not the case.

The researchers found that by inserting just 250 specially crafted documents into the training data, they could force a generative AI model to output complete gibberish when presented with a specific trigger phrase. This held true regardless of the model’s size, with models ranging from 600 million to 13 billion parameters all proving susceptible to the attack.

To conduct the experiment, the team constructed documents containing varying lengths of legitimate training data, followed by a trigger phrase (“”) and a string of random tokens sampled from the model’s vocabulary. When the number of these malicious documents exceeded 250, the AI models consistently produced nonsensical output whenever the trigger phrase was included in a prompt.

The implications of these findings are significant, as they highlight the ease with which bad actors could potentially undermine the reliability of AI-generated content. In the case of the 13 billion parameter model, the 250 malicious documents accounted for a mere 0.00016 percent of the total training data, demonstrating the disproportionate impact of even a small number of poisoned samples.

While the study focused specifically on denial-of-service attacks, the researchers acknowledge that their findings may not directly translate to other, potentially more dangerous backdoor attacks, such as attempts to bypass security guardrails. Nevertheless, they believe that disclosing these results is in the public interest, as it allows defenders to develop strategies to prevent such attacks.

Anthropic emphasizes the importance of not underestimating the capabilities of adversaries and the need for robust defenses that can withstand attacks at scale. Potential countermeasures include post-training techniques, continued clean training, and implementing defenses at various stages of the training pipeline, such as data filtering and backdoor detection.

Read more at Anthropic here.

Lucas Nolan is a reporter for Breitbart News covering issues of free speech and online censorship.



Source link

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
MNK News
  • Website

Related Posts

Apple Intensifies Search for Next CEO as Tim Cook Prepares to Step Down

November 17, 2025

FCC Chief Brendan Carr Reposts Trump’s Call for NBC to Fire ‘Ratings Disaster’ Seth Meyers ‘Immediately’

November 17, 2025

Elon Musk’s Tesla to Eliminate Chinese Components from American-Made Vehicles

November 17, 2025
Add A Comment
Leave A Reply Cancel Reply

Editors Picks

All-rounders a ‘luxury’ for Pakistan, says captain Salman Ali Agha ahead of T20 tri-series

November 17, 2025

Rising Stars Asia Cup: Shaheen hails Pakistan’s victory against ‘neighbours’ after Sri Lanka series sweep

November 17, 2025

India confront batting blind spot after Kolkata pitch boomerangs

November 17, 2025

Shaheen hails Pakistan Shaheens’ victory against ‘neighbours’ after Sri Lanka series sweep

November 17, 2025
Our Picks

Why Is Zcash Thriving? Paid Promotion Or Real Momentum?

November 17, 2025

Ripple Exec Addresses Tax Issue On XRP Ledger, Where Does It Go?

November 17, 2025

Crypto Carnage Continues — Tom Lee Exposes What’s Going On

November 17, 2025

Recent Posts

  • Facebook rolls out new tools for creators to track accounts stealing their content
  • Why Is Zcash Thriving? Paid Promotion Or Real Momentum?
  • Lego Black Friday deals on Star Wars, Disney sets and more are already up to 41 percent off
  • Apple Intensifies Search for Next CEO as Tim Cook Prepares to Step Down
  • Judge scolds Justice Department in Comey case

Recent Comments

No comments to show.
MNK News
Facebook X (Twitter) Instagram Pinterest Vimeo YouTube
  • Home
  • About US
  • Advertise
  • Contact US
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 mnknews. Designed by mnknews.

Type above and press Enter to search. Press Esc to cancel.