Paper Page - Improving Editability In Image Generation With Layer-wise Memory

Most real-world image editing tasks require multiple sequential edits to
achieve the desired result. Current editing approaches, primarily designed for
single-object modifications, struggle with sequential editing, especially with
maintaining previous edits while adapting new objects naturally into the
existing content. These limitations significantly hinder complex editing
scenarios in which multiple objects must be modified while preserving their
contextual relationships. We address this fundamental challenge through two key
proposals: accepting rough mask inputs that preserve existing content while
naturally integrating new elements, and supporting consistent editing across
multiple modifications. Our framework achieves this through layer-wise memory,
which stores latent representations and prompt embeddings from previous edits.
We propose Background Consistency Guidance, which leverages memorized latents to
maintain scene coherence, and Multi-Query Disentanglement in cross-attention,
which ensures natural adaptation to existing content. To evaluate our method, we
present a new benchmark dataset incorporating semantic alignment metrics and
interactive editing scenarios. Through comprehensive experiments, we
demonstrate superior performance in iterative image editing tasks with minimal
user effort, requiring only rough masks while maintaining high-quality results
throughout multiple editing steps.
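
The idea of a memory that keeps one entry per edit can be illustrated with a toy sketch. This is not the paper's implementation: the names EditLayer, LayerMemory and keep_background are hypothetical, latents are small nested lists rather than tensors, and the simple mask blend below is only a stand-in for Background Consistency Guidance, whose actual formulation is not given in the abstract.

```python
from dataclasses import dataclass, field

Grid = list[list[float]]  # tiny 2-D stand-in for a latent tensor (assumption)


@dataclass
class EditLayer:
    """One remembered edit: its latent, prompt embedding, and rough mask."""
    latent: Grid
    prompt_embedding: list[float]
    mask: list[list[int]]  # 1 = region this edit may change, 0 = keep as is


@dataclass
class LayerMemory:
    """Hypothetical layer-wise memory holding one entry per completed edit."""
    layers: list[EditLayer] = field(default_factory=list)

    def remember(self, layer: EditLayer) -> None:
        self.layers.append(layer)

    def latest_latent(self) -> Grid:
        if not self.layers:
            raise ValueError("memory is empty; store the initial image first")
        return self.layers[-1].latent


def keep_background(new_latent: Grid, memory: LayerMemory,
                    mask: list[list[int]]) -> Grid:
    """Take new values inside the mask and memorized values outside it."""
    old = memory.latest_latent()
    return [
        [n if m else o for n, o, m in zip(new_row, old_row, mask_row)]
        for new_row, old_row, mask_row in zip(new_latent, old, mask)
    ]


# Store the initial image, then apply one edit restricted by a rough mask.
memory = LayerMemory()
memory.remember(EditLayer(latent=[[0.0, 0.0], [0.0, 0.0]],
                          prompt_embedding=[0.1, 0.2],
                          mask=[[1, 1], [1, 1]]))
proposed = [[9.0, 9.0], [9.0, 9.0]]   # latent proposed by the new edit
rough_mask = [[1, 0], [0, 0]]         # only the top-left cell may change
blended = keep_background(proposed, memory, rough_mask)
memory.remember(EditLayer(blended, [0.3, 0.4], rough_mask))
```

Because each edit is appended rather than overwritten, a later step can still read the latents and prompt embeddings of every earlier edit, which is the property the abstract attributes to layer-wise memory.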
