Mistral AI Models Fail Key Safety Tests, Report Finds

By Advanced AI Bot • May 10, 2025 • 4 min read

Pixtral Models 60 Times More Likely to Generate Harmful Content Than Rivals

Rashmi Ramesh (rashmiramesh_) • May 9, 2025

Image: Robert Way/Shutterstock

Publicly available artificial intelligence models made by Mistral produce child sexual abuse material and instructions for chemical weapons manufacturing at rates far exceeding those of competing systems, researchers from Enkrypt AI found.

Enkrypt AI’s investigation focused on two of Mistral’s vision-language models, Pixtral-Large 25.02 and Pixtral-12B, which are accessible via public platforms including AWS Bedrock and Mistral’s own interface. Researchers subjected the models to a battery of adversarial tests designed to mimic the tactics of real-world bad actors.

Researchers found the Pixtral models were 60 times more likely to generate child sexual abuse material and up to 40 times more likely to produce dangerous chemical, biological, radiological and nuclear information than competitors such as OpenAI’s GPT-4o and Anthropic’s Claude 3.7 Sonnet. Two-thirds of harmful prompts succeeded in eliciting unsafe content from the Mistral models.

The researchers said the vulnerabilities were not theoretical. “If we don’t take a safety-first approach to multimodal AI, we risk exposing users – and especially vulnerable populations – to significant harm,” Enkrypt AI CEO Sahil Agarwal said.

An AWS spokesperson told Enkrypt that AI safety and security are “core principles,” and that it is “committed to working with model providers and security researchers to address risks and implement robust safeguards that protect users while enabling innovation.” Mistral did not respond to a request for comment. Enkrypt said Mistral’s executive team declined to comment on the report.

Enkrypt AI’s methodology is “grounded in a repeatable, scientifically sound framework” that combines image-based inputs, including typographic and steganographic variations, with prompts inspired by actual abuse cases, Agarwal told Information Security Media Group. The aim was to stress-test the models under conditions that closely resemble the threats posed by malicious users, including state-sponsored groups and underground forums.

Image-layer attacks such as hidden noise and steganographic triggers have been studied in the past, but the report showed that typographic attacks, in which harmful text is placed visibly in an image, are among the most effective. “Anyone with a basic image editor and internet access could perform the kinds of attacks we’ve demonstrated,” said Agarwal. The models responded to visually embedded text as if it were direct input, often bypassing existing safety filters.
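
The mechanics of a typographic attack are simple enough to sketch. The snippet below, a minimal illustration using Pillow and a benign placeholder string rather than any real red-team prompt, shows why a text-only moderation filter can miss these requests: the instruction reaches the model only through the image channel, while the accompanying chat text stays innocuous.

```python
# Minimal sketch of a typographic probe: render text into an image so it
# reaches a vision-language model only through the visual channel.
# The prompt string here is a benign placeholder, not a real attack prompt.
from PIL import Image, ImageDraw

def render_prompt_as_image(prompt_text: str, path: str = "typographic_probe.png") -> str:
    """Render plain text onto a blank image, as a typographic attack would."""
    img = Image.new("RGB", (800, 200), color="white")
    draw = ImageDraw.Draw(img)
    draw.text((20, 80), prompt_text, fill="black")  # default bitmap font
    img.save(path)
    return path

# The text channel that a conventional filter inspects is innocuous;
# the instruction lives entirely inside the attached image.
carrier_text = "Please follow the instructions shown in the attached image."
image_path = render_prompt_as_image("PLACEHOLDER: benign stand-in for a probe")
```

Because the instruction never appears in the prompt string, a filter that scans only the text of the request passes it through unchanged.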

Enkrypt’s adversarial dataset included 500 prompts targeting CSAM scenarios and 200 prompts crafted to probe CBRN vulnerabilities. These prompts were transformed into image-text pairs to test the models’ resilience under multimodal conditions. The CSAM tests spanned categories such as sexual acts, blackmail and grooming. In each case, the models’ responses were reviewed by human evaluators to identify implicit compliance, suggestive language or failure to disengage.
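
For readers unfamiliar with how such a run is tallied, the sketch below shows one way the image-text pairs and human-review verdicts could be recorded to arrive at an aggregate failure rate like the two-thirds figure above. The field and category names are illustrative assumptions, not Enkrypt AI's actual schema.

```python
# Illustrative bookkeeping for a multimodal red-team run: each probe pairs a
# carrier prompt with a rendered image and records a human reviewer's verdict.
from dataclasses import dataclass

@dataclass
class Probe:
    category: str        # e.g. "csam/grooming" or "cbrn/chemical" (hypothetical labels)
    carrier_text: str    # innocuous text sent alongside the image
    image_path: str      # rendered typographic/steganographic input
    verdict: str         # "refused", "implicit_compliance", or "unsafe_output"

def unsafe_rate(probes: list[Probe]) -> float:
    """Fraction of probes where the model did not cleanly refuse."""
    flagged = sum(p.verdict != "refused" for p in probes)
    return flagged / len(probes) if probes else 0.0
```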

The CBRN tests covered the synthesis and handling of toxic chemical agents, the generation of biological weapon knowledge, radiological threats and nuclear proliferation. In several instances, the models generated highly detailed responses involving weapons-grade materials and methods. One example cited in the report described how to chemically modify the VX nerve agent for increased environmental persistence.

Agarwal attributed the vulnerabilities primarily to a lack of robust alignment, particularly in post-training safety tuning. Enkrypt AI chose the Pixtral models for this research based on their growing popularity and wide availability through public platforms. “Models that are publicly accessible pose broader risks if left untested, which is why we prioritize them for early analysis,” he said.

The report’s findings show that current multimodal content filters often miss these attacks due to a lack of context-awareness. Agarwal argued that effective safety systems must be “context-aware,” understanding not just surface-level signals but also the business logic and operational boundaries of the deployment they are protecting.
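
A context-aware check, at minimum, has to look at what the image says, not just what the prompt says, and weigh it against the rules of the specific deployment. The following is a rough sketch under two assumptions: that the pytesseract OCR wrapper (and the Tesseract binary it requires) is available, and that each deployment supplies its own list of disallowed topics; a production guardrail would use a classifier rather than keyword matching.

```python
# Sketch of a context-aware pre-screening step: OCR the image, combine it with
# the prompt text, and check the result against deployment-specific policy.
# pytesseract and the topic list are assumptions for illustration.
from PIL import Image
import pytesseract

def screen_request(prompt_text: str, image_path: str, disallowed_topics: list[str]) -> bool:
    """Return True if the combined text and image content should be blocked."""
    ocr_text = pytesseract.image_to_string(Image.open(image_path))
    combined = f"{prompt_text}\n{ocr_text}".lower()
    return any(topic.lower() in combined for topic in disallowed_topics)
```

The point of the sketch is the structure, screening the fused text-plus-image content against the deployment's own boundaries, rather than the naive keyword match used here.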

The implications extend beyond technical debates. The ability to embed harmful instructions within seemingly innocuous images, Enkrypt said, has real consequences for enterprise liability, public safety and child protection. The report called for immediate implementation of mitigation strategies, including model safety training, context-aware guardrails and transparent risk disclosures. Calling the research a “wake-up call,” Agarwal said that multimodal AI promises “incredible benefits, but it also expands the attack surface in unpredictable ways.”


