Paper page - Ambient Diffusion Omni: Training Good Models with Bad Data

The Ambient Diffusion Omni framework uses low-quality images to improve diffusion models by exploiting two properties of natural images, spectral power-law decay and locality, and it reports improvements in ImageNet FID and in text-to-image quality and diversity.

We show how to use low-quality, synthetic, and out-of-distribution images to
improve the quality of a diffusion model. Typically, diffusion models are
trained on curated datasets that emerge from highly filtered data pools from
the Web and other sources. We show that there is immense value in the
lower-quality images that are often discarded. We present Ambient Diffusion
Omni, a simple, principled framework to train diffusion models that can extract
signal from all available images during training. Our framework exploits two
properties of natural images: spectral power law decay and locality. We first
validate our framework by successfully training diffusion models with images
synthetically corrupted by Gaussian blur, JPEG compression, and motion blur. We
then use our framework to achieve state-of-the-art ImageNet FID, and we show
significant improvements in both image quality and diversity for text-to-image
generative modeling. The core insight is that noise dampens the initial skew
between the desired high-quality distribution and the mixed distribution we
actually observe. We provide rigorous theoretical justification for our
approach by analyzing the trade-off between learning from biased data versus
limited unbiased data across diffusion times.
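
The claim that noise dampens the skew between distributions can be illustrated with a small toy calculation. The sketch below is not the paper's method or code. It assumes a simplified model in which each frequency of a signal is an independent zero-mean Gaussian with power-law variance, and in which blur multiplies that variance by a Gaussian-shaped gain. The function name `noisy_kl` and the parameters `alpha` and `blur_width` are hypothetical choices made for illustration. Under these assumptions, the KL divergence between the blurred and clean distributions shrinks as the added noise level grows.

```python
import numpy as np

def noisy_kl(sigma, n_freqs=256, alpha=2.0, blur_width=20.0):
    """KL(blurred || clean) after adding white Gaussian noise of std sigma.

    Toy model (an assumption, not the paper's setup): each frequency k is an
    independent zero-mean Gaussian whose variance decays as k**(-alpha).
    """
    k = np.arange(1, n_freqs + 1, dtype=float)
    clean_var = k ** (-alpha)                    # power-law spectrum
    blur_gain = np.exp(-(k / blur_width) ** 2)   # blur attenuates high frequencies
    blurred_var = clean_var * blur_gain
    # Additive white noise adds sigma^2 to the variance at every frequency.
    ratio = (blurred_var + sigma ** 2) / (clean_var + sigma ** 2)
    # KL between zero-mean Gaussians, summed over independent frequencies.
    return 0.5 * np.sum(ratio - 1.0 - np.log(ratio))

sigmas = [0.0, 0.01, 0.05, 0.2, 1.0]
kls = [noisy_kl(s) for s in sigmas]
for s, v in zip(sigmas, kls):
    print(f"sigma={s:<5} KL={v:.4f}")
```

In this toy model the mismatch is concentrated at high frequencies, where the power-law spectrum carries little energy, so even moderate noise swamps it. This is consistent with the abstract's statement that corrupted images are most usable at high-noise diffusion times, though the paper's actual analysis should be consulted for the precise trade-off.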
