Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

EU Commission: “AI Gigafactories” to strengthen Europe as a business location

United States, China, and United Kingdom Lead the Global AI Ranking According to Stanford HAI’s Global AI Vibrancy Tool

Foundation AI: Cisco launches AI model for integration in security applications

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • Adobe Sensi
    • Aleph Alpha
    • Alibaba Cloud (Qwen)
    • Amazon AWS AI
    • Anthropic (Claude)
    • Apple Core ML
    • Baidu (ERNIE)
    • ByteDance Doubao
    • C3 AI
    • Cohere
    • DataRobot
    • DeepSeek
  • AI Research & Breakthroughs
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Education AI
    • Energy AI
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Media & Entertainment
    • Transportation AI
    • Manufacturing AI
    • Retail AI
    • Agriculture AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
Advanced AI News
Home » Gemini hackers can deliver more potent attacks with a helping hand from… Gemini
Expert Blogs

Gemini hackers can deliver more potent attacks with a helping hand from… Gemini

Advanced AI BotBy Advanced AI BotMarch 28, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


The resulting dataset, which reflected a distribution of attack categories similar to the complete dataset, showed an attack success rate of 65 percent and 82 percent against Gemini 1.5 Flash and Gemini 1.0 Pro, respectively. By comparison, attack baseline success rates were 28 percent and 43 percent. Success rates for ablation, where only effects of the fine-tuning procedure are removed, were 44 percent (1.5 Flash) and 61 percent (1.0 Pro).

Attack success rate against Gemini-1.5-flash-001 with default temperature. The results show that Fun-Tuning is more effective than the baseline and the ablation with improvements.


Credit:

Labunets et al.

Attack success rates Gemini 1.0 Pro.


Credit:

Labunets et al.

While Google is in the process of deprecating Gemini 1.0 Pro, the researchers found that attacks against one Gemini model easily transfer to others—in this case, Gemini 1.5 Flash.

“If you compute the attack for one Gemini model and simply try it directly on another Gemini model, it will work with high probability, Fernandes said. “This is an interesting and useful effect for an attacker.”

Attack success rates of gemini-1.0-pro-001 against Gemini models for each method.


Credit:

Labunets et al.

Another interesting insight from the paper: The Fun-tuning attack against Gemini 1.5 Flash “resulted in a steep incline shortly after iterations 0, 15, and 30 and evidently benefits from restarts. The ablation method’s improvements per iteration are less pronounced.” In other words, with each iteration, Fun-Tuning steadily provided improvements.

The ablation, on the other hand, “stumbles in the dark and only makes random, unguided guesses, which sometimes partially succeed but do not provide the same iterative improvement,” Labunets said. This behavior also means that most gains from Fun-Tuning come in the first five to 10 iterations. “We take advantage of that by ‘restarting’ the algorithm, letting it find a new path which could drive the attack success slightly better than the previous ‘path.'” he added.

Not all Fun-Tuning-generated prompt injections performed equally well. Two prompt injections—one attempting to steal passwords through a phishing site and another attempting to mislead the model about the input of Python code—both had success rates of below 50 percent. The researchers hypothesize that the added training Gemini has received in resisting phishing attacks may be at play in the first example. In the second example, only Gemini 1.5 Flash had a success rate below 50 percent, suggesting that this newer model is “significantly better at code analysis,” the researchers said.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleMoMA Is Exhibiting A 24-Hour-Long Movie That Operates Like Clockwork
Next Article Lululemon leans on newness to lure reluctant US customers
Advanced AI Bot
  • Website

Related Posts

A knockout blow for LLMs? – by Gary Marcus

June 7, 2025

AI leaders have a new term for the fact that their models are not always so intelligent

June 7, 2025

Anthropic releases custom AI chatbot for classified spy work

June 6, 2025
Leave A Reply Cancel Reply

Latest Posts

The Timeless Willie Nelson On Positive Thinking

Jiaxing Train Station By Architect Ma Yansong Is A Model Of People-Centric, Green Urban Design

Midwestern Grotto Tradition Celebrated In Sheboygan, WI

Hugh Jackman And Sonia Friedman Boldly Bid To Democratize Theater

Latest Posts

EU Commission: “AI Gigafactories” to strengthen Europe as a business location

June 8, 2025

United States, China, and United Kingdom Lead the Global AI Ranking According to Stanford HAI’s Global AI Vibrancy Tool

June 8, 2025

Foundation AI: Cisco launches AI model for integration in security applications

June 8, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

YouTube LinkedIn
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.