Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing - Takara TLDR

Visual autoregressive models (VAR) have recently emerged as a promising class
of generative models, achieving performance comparable to diffusion models in
text-to-image generation tasks. While conditional generation has been widely
explored, the ability to perform prompt-guided image editing without additional
training is equally critical, as it supports numerous practical
applications. This paper investigates the text-to-image editing capabilities of
VAR by introducing Visual AutoRegressive Inverse Noise (VARIN), the first noise
inversion-based editing technique designed explicitly for VAR models. VARIN
leverages a novel pseudo-inverse function for argmax sampling, named
Location-aware Argmax Inversion (LAI), to generate inverse Gumbel noises. These
inverse noises enable precise reconstruction of the source image and facilitate
targeted, controllable edits aligned with textual prompts. Experiments
show that VARIN modifies source images according to the specified prompts
while largely preserving the original background and structural details,
supporting its use as a practical editing approach.
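
The abstract does not spell out how LAI is defined, so the following is only a generic sketch of what "inverting" argmax sampling can mean under the Gumbel-max trick, where a token is drawn as argmax(logits + Gumbel noise). It is not the paper's LAI and does not use any location information. The helper names (`invert_argmax`, `sample_with_noise`), the `margin` parameter, and the toy logits are all assumptions made for illustration.

```python
import math
import random


def gumbel(rng):
    # Standard Gumbel(0, 1) sample via inverse transform; the clamp avoids log(0).
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sample_with_noise(logits, noise):
    # Gumbel-max sampling: the chosen token is the argmax of logits + noise.
    scores = [l + n for l, n in zip(logits, noise)]
    return max(range(len(scores)), key=scores.__getitem__)


def invert_argmax(logits, token, rng, margin=1e-3):
    """Hypothetical pseudo-inverse: return noise with argmax(logits + noise) == token."""
    noise = [gumbel(rng) for _ in logits]
    # Highest perturbed score among all competing tokens.
    best_other = max(
        (l + n for i, (l, n) in enumerate(zip(logits, noise)) if i != token),
        default=-math.inf,
    )
    # Raise the observed token's noise only if it would otherwise lose.
    needed = best_other + margin - logits[token]
    noise[token] = max(noise[token], needed)
    return noise


rng = random.Random(0)
source_logits = [0.2, 1.5, -0.3, 0.9]   # toy logits under the source prompt
observed_token = 2                      # token actually present in the source image

noise = invert_argmax(source_logits, observed_token, rng)
reconstructed = sample_with_noise(source_logits, noise)

# Toy logits under an edited prompt that strongly prefers token 3.
edited_logits = [0.2, 1.5, -0.3, 50.0]
edited = sample_with_noise(edited_logits, noise)
```

Reusing the recovered noise with the source logits reproduces the observed token by construction. With logits from a different prompt, the token changes only where the new logits outweigh the stored noise, which is the intuition behind reconstruction plus targeted edits.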
