Paper Page - RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Inspired by the in-context learning mechanism of large language models (LLMs), a new paradigm of generalizable visual prompt-based image editing is emerging. Existing single-reference methods typically focus on style or appearance adjustments and struggle with non-rigid transformations. To address these limitations, we propose leveraging source-target image pairs to extract and transfer content-aware editing intent to novel query images. To this end, we introduce RelationAdapter, a lightweight module that enables Diffusion Transformer (DiT) based models to effectively capture and apply visual transformations from minimal examples. We also introduce Relation252K, a comprehensive dataset comprising 218 diverse editing tasks, to evaluate model generalization and adaptability in visual prompt-driven scenarios. Experiments on Relation252K show that RelationAdapter significantly improves the model’s ability to understand and transfer editing intent, leading to notable gains in generation quality and overall editing performance.

