This work represents the first effort to scale up continuous-time consistency
distillation to general application-level image and video diffusion models.
Although the continuous-time consistency model (sCM) is theoretically principled
and empirically powerful for accelerating academic-scale diffusion models, its
applicability to large-scale text-to-image and video tasks remains unclear due
to infrastructure challenges in Jacobian-vector product (JVP) computation and
the limitations of standard evaluation benchmarks. We first develop a
parallelism-compatible FlashAttention-2 JVP kernel, enabling sCM training on
models with over 10 billion parameters and high-dimensional video tasks.
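To make concrete what such a kernel computes, the sketch below (an illustrative
assumption, not the paper's fused implementation) uses torch.func.jvp, PyTorch's
forward-mode autodiff, to push tangents through a naive softmax attention; the
naive form materializes the full attention-score matrix, which is precisely what
a FlashAttention-2-style JVP kernel avoids at scale.

```python
import torch
from torch.func import jvp

def attention(q, k, v):
    # Naive softmax attention; materializes the (S x S) score matrix.
    # The paper's kernel computes the same JVP in fused FlashAttention-2
    # style without this materialization, compatibly with parallelism.
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

B, H, S, D = 2, 8, 128, 64                                 # illustrative sizes
q, k, v = (torch.randn(B, H, S, D) for _ in range(3))      # primals
tq, tk, tv = (torch.randn(B, H, S, D) for _ in range(3))   # tangent directions

# Forward-mode AD: returns attention(q, k, v) together with its directional
# derivative along (tq, tk, tv), computed in a single forward pass.
out, out_tangent = jvp(attention, (q, k, v), (tq, tk, tv))
assert out.shape == out_tangent.shape == (B, H, S, D)
```

Our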
investigation reveals fundamental quality limitations of sCM in fine-detail
generation, which we attribute to error accumulation and the “mode-covering”
nature of its forward-divergence objective. To remedy this, we propose the
score-regularized continuous-time consistency model (rCM), which incorporates
score distillation as a long-skip regularizer. This integration complements sCM
with the “mode-seeking” reverse divergence, effectively improving visual
quality while maintaining high generation diversity.
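Schematically (an illustrative reading of this combination, not the paper's
exact formulation; the weight $\lambda$ and the loss symbols are assumptions),
the objective pairs the two divergences as
\[
\mathcal{L}_{\mathrm{rCM}}(\theta)
  = \underbrace{\mathcal{L}_{\mathrm{sCM}}(\theta)}_{\text{forward divergence (mode-covering)}}
  + \lambda\,\underbrace{\mathcal{L}_{\mathrm{score}}(\theta)}_{\text{reverse divergence (mode-seeking)}},
\]
so the score-distillation term sharpens fine detail while the consistency term
preserves coverage of the data distribution. Validated on large-scale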
models (Cosmos-Predict2, Wan2.1) up to 14B parameters and 5-second videos, rCM
matches or surpasses the state-of-the-art distillation method DMD2 on quality
metrics while offering notable advantages in diversity, all without GAN tuning
or extensive hyperparameter searches. The distilled models generate
high-fidelity samples in only $1\sim4$ steps, accelerating diffusion sampling
by $15\times\sim50\times$. These results position rCM as a practical and
theoretically grounded framework for advancing large-scale diffusion
distillation.