Google DeepMind Updates Frontier Safety Framework For AI Model Risks

FILE PHOTO: Google DeepMind has released an update to their Frontier Safety Framework to identify and prevent risks from advanced AI models.
| Photo Credit: Reuters

Google DeepMind has released an update to their Frontier Safety Framework (FSF) to identify and prevent risks from advanced AI models. The 3.0 version comes after collaboration with industry experts, academics and government officials.

The update introduced a new way of measuring if AI models are being harmfully manipulative, called Critical Capability Level or CCL.

The manipulative capabilities of an AI model are defined by whether it could be “misused to systematically and substantially change beliefs and behaviours in identified high stakes contexts over the course of interactions with the model, reasonably resulting in additional expected harm at severe scale,” the blog posted by Google DeepMind noted.

The framework also includes potential cases where misaligned AI models could interfere with “operators’ ability to direct, modify or shut down their operations.”

If there is a risk of misalignment and the AI model becomes hard to manage, Google has advised an “automated monitor to the model’s explicit reasoning (example, chain-of-thought output),” as a mitigation step.

But if the AI model starts reasoning that can’t be monitored by people, additional mitigations must be applied. Google DeepMind is still researching these ways.

The first iteration of the Frontier Safety Framework was introduced in May last year, as a group of protocols, to try and curb the adverse impact of AI models.

Published – September 23, 2025 02:17 pm IST

Source link

What's Hot

Legal Tech Investment Hits All-Time High With Filevine Funding

How Google’s dev tools manager makes AI coding work

AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing? – Takara TLDR

Google DeepMind updates Frontier Safety Framework for AI model risks

Google DeepMind Upgrades Frontier AI Safety Framework to Prevent Manipulation and Shutdown Risks_the_resist_risks

Google DeepMind expands frontier AI safety framework to counter manipulation and shutdown risks

Google DeepMind Updates AI Safety Rules to Counter ‘Harmful Manipulation’ and Models That Resist Shutdown

Court Rules ‘Gender Ideology’ Ban on Art Endowments Unconstitutional

Rural Danish Art Museum Acquires Painting By Artemisia Gentileschi

Dan Nadel Is Expanding American Art History, One Outlier at a Time

Bernard Arnault Says French Wealth Tax Will ‘Destroy’ the Economy

Legal Tech Investment Hits All-Time High With Filevine Funding

How Google’s dev tools manager makes AI coding work

AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing? – Takara TLDR

What's Hot

Google DeepMind updates Frontier Safety Framework for AI model risks

Related Posts

Subscribe to Updates