Cyber-Zero is a runtime-free framework that synthesizes agent trajectories from CTF writeups to train cybersecurity LLMs, achieving state-of-the-art performance on CTF benchmarks.
Large Language Models (LLMs) have achieved remarkable success in software
engineering tasks when trained with executable runtime environments,
particularly in resolving GitHub issues. However, such runtime environments are
often unavailable in other domains, especially cybersecurity, where challenge
configurations and execution contexts are ephemeral or restricted. We present
Cyber-Zero, the first runtime-free framework for synthesizing high-quality
agent trajectories to train cybersecurity LLMs. Cyber-Zero leverages publicly
available CTF writeups and employs persona-driven LLM simulation to
reverse-engineer runtime behaviors and generate realistic, long-horizon
interaction sequences without actual environments. Using trajectories
synthesized by Cyber-Zero, we train LLM-based agents that achieve up to 13.1%
absolute performance gains over baseline models on three prominent CTF
benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best model,
Cyber-Zero-32B, establishes new state-of-the-art performance among open-weight
models, matching the capabilities of proprietary systems such as
DeepSeek-V3-0324 and Claude-3.5-Sonnet while offering superior
cost-effectiveness. These results demonstrate that runtime-free trajectory
synthesis can effectively democratize the development of state-of-the-art
cybersecurity agents.