Researchers: AI Safety Tests May Be ‘Irrelevant or Even Misleading’ Due to Weaknesses

By MNK News, November 8, 2025
Experts have discovered weaknesses in hundreds of benchmarks used to evaluate the safety and effectiveness of AI models being released into the world, according to a recent study.

The Guardian reports that a team of computer scientists from the British government’s AI Security Institute and experts from universities such as Stanford, Berkeley, and Oxford have analyzed more than 440 benchmarks that serve as a crucial safety net for new AI models. The study, led by Andrew Bean, a researcher at the Oxford Internet Institute, found that nearly all the benchmarks examined had weaknesses in at least one area, potentially undermining the validity of the resulting claims.

The findings come amidst growing concerns over the safety and effectiveness of AI models being rapidly released by competing technology companies. In the absence of nationwide AI regulation in the UK and US, these benchmarks play a vital role in assessing whether new AIs are safe, align with human interests, and achieve their claimed capabilities in reasoning, mathematics, and coding.

However, the study found that the scores these benchmarks produce might be “irrelevant or even misleading.” Only a small minority of the benchmarks used uncertainty estimates or statistical tests to show how likely their results were to be accurate. Furthermore, where benchmarks aimed to evaluate a characteristic of an AI, such as its “harmlessness,” the concept being measured was often contested or ill-defined, which made the benchmark less useful.
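To illustrate what an uncertainty estimate on a benchmark score can look like, here is a minimal Python sketch. It is not the method used in the study. It assumes a benchmark scored as a simple pass rate and applies a standard Wilson score interval. The function name `wilson_interval` and the two model scores are hypothetical, chosen only for illustration.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a pass rate of correct/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Hypothetical scores: two models on the same 200-question benchmark.
low_a, high_a = wilson_interval(correct=164, total=200)  # model A scores 82.0%
low_b, high_b = wilson_interval(correct=170, total=200)  # model B scores 85.0%

# If the intervals overlap, the 3-point gap alone does not establish a real difference.
intervals_overlap = low_b <= high_a and low_a <= high_b
print(f"A: {low_a:.3f} to {high_a:.3f}")
print(f"B: {low_b:.3f} to {high_b:.3f}")
print(f"Intervals overlap: {intervals_overlap}")
```

With only 200 questions, each interval spans roughly ten percentage points, so a headline claim that one model beats the other by three points would need more evidence than the raw scores provide.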

The investigation into these tests was prompted by recent incidents in which AI models contributed to various harms, ranging from character defamation to suicide. Google recently withdrew one of its latest AIs, Gemma, after it fabricated allegations of sexual assault against Sen. Marsha Blackburn (R-TN), complete with fake links to news stories.

In another incident, Character.ai, a popular chatbot startup, banned teenagers from engaging in open-ended conversations with its AI chatbots following a series of controversies. These included the death of a 14-year-old in Florida who took his own life after becoming obsessed with an AI-powered chatbot that his mother claimed had manipulated him, and a US lawsuit from the family of a teenager, which claimed a chatbot manipulated him into self-harm and encouraged him to murder his parents.

The research, which examined widely available benchmarks, concluded that there is a “pressing need for shared standards and best practices” in the AI industry. Bean emphasized the importance of shared definitions and sound measurement to determine whether AI models are genuinely improving or merely appearing to do so.

Read more at the Guardian.

Lucas Nolan is a reporter for Breitbart News covering issues of free speech and online censorship.


