Paper page - Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model

Balancing fidelity and editability is essential in text-based image editing
(TIE), where failures commonly lead to over- or under-editing issues. Existing
methods typically rely on attention injections for structure preservation and
leverage the inherent text alignment capabilities of pre-trained text-to-image
(T2I) models for editability, but they lack explicit and unified mechanisms to
properly balance these two objectives. In this work, we introduce UnifyEdit, a
tuning-free method that performs diffusion latent optimization to enable a
balanced integration of fidelity and editability within a unified framework.
Unlike direct attention injections, we develop two attention-based constraints:
a self-attention (SA) preservation constraint for structural fidelity, and a
cross-attention (CA) alignment constraint to enhance text alignment for
improved editability. However, simultaneously applying both constraints can
lead to gradient conflicts, where the dominance of one constraint results in
over- or under-editing. To address this challenge, we introduce an adaptive
time-step scheduler that dynamically adjusts the influence of these
constraints, guiding the diffusion latent toward an optimal balance. Extensive
quantitative and qualitative experiments validate the effectiveness of our
approach, demonstrating its superiority in achieving a robust balance between
structure preservation and text alignment across various editing tasks,
outperforming other state-of-the-art methods. The source code will be available
at https://github.com/CUC-MIPG/UnifyEdit.
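
To make the idea of balancing two constraints during latent optimization more concrete, here is a toy sketch. It is not the UnifyEdit implementation. The quadratic losses stand in for the SA preservation and CA alignment constraints, and the `schedule` and `optimize_latent` functions, the linear weighting, and the learning rate are all hypothetical choices made only for illustration.

```python
def schedule(step, num_steps):
    """Hypothetical linear schedule returning (w_sa, w_ca), which sum to 1.

    Early steps favor the alignment (editability) term and late steps favor
    the preservation (fidelity) term. The real scheduler is adaptive; this
    fixed ramp is only an assumption for the sketch.
    """
    frac = step / max(num_steps - 1, 1)
    return frac, 1.0 - frac


def optimize_latent(z_src, z_tgt, num_steps=10, lr=0.5):
    """Gradient descent on a weighted sum of two stand-in losses.

    Fidelity stand-in:    0.5 * ||z - z_src||^2  (pulls toward the source)
    Editability stand-in: 0.5 * ||z - z_tgt||^2  (pulls toward the edit target)
    """
    z = list(z_src)  # start from the source latent
    for step in range(num_steps):
        w_sa, w_ca = schedule(step, num_steps)
        g_sa = [a - b for a, b in zip(z, z_src)]  # gradient of fidelity term
        g_ca = [a - b for a, b in zip(z, z_tgt)]  # gradient of editability term
        # A single update using the weighted combination of both gradients.
        z = [a - lr * (w_sa * gs + w_ca * gc)
             for a, gs, gc in zip(z, g_sa, g_ca)]
    return z


if __name__ == "__main__":
    print(optimize_latent([0.0, 1.0], [1.0, -1.0]))
```

In this toy setting, letting one weight dominate for every step drives the latent all the way to the source (under-editing) or to the target (over-editing); varying the weights over steps lands it in between.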
