Paper Page - DMM: Building A Versatile Image Generation Model Via Distillation-Based Model Merging

The success of text-to-image (T2I) generation models has spurred a
proliferation of model checkpoints fine-tuned from the same base model
on various specialized datasets. This surge of specialized models
introduces new challenges of high parameter redundancy and large storage cost,
necessitating effective methods to consolidate and
unify the capabilities of diverse powerful models into a single one. A common
practice in model merging is static linear interpolation in the parameter
space to achieve style mixing. However, this neglects a feature of the
T2I generation task: the many distinct models cover a wide variety of styles,
which may lead to incompatibility and confusion in the merged model. To address
this issue, we introduce a style-promptable image generation pipeline that can
accurately generate arbitrary-style images under the control of style vectors.
Based on this design, we propose the score-distillation-based model merging
paradigm (DMM), which compresses multiple models into a single versatile T2I
model. Moreover, we rethink and reformulate the model merging task in the
context of T2I generation by presenting new merging goals and evaluation
protocols. Our experiments demonstrate that DMM can compactly reorganize the
knowledge from multiple teacher models and achieve controllable arbitrary-style
generation.
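
For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of static linear interpolation in parameter space. It is not the DMM method and not code from the paper. The `StateDict` type, the `merge_linear` function, the parameter names, and the toy values are all hypothetical stand-ins for real checkpoints fine-tuned from one base model.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def merge_linear(checkpoints: Sequence[StateDict],
                 weights: Sequence[float]) -> StateDict:
    """Static linear interpolation: theta_merged = sum_i w_i * theta_i."""
    if not checkpoints or len(checkpoints) != len(weights):
        raise ValueError("need one weight per checkpoint")
    if abs(sum(weights) - 1.0) > 1e-8:
        raise ValueError("interpolation weights should sum to 1")

    # All checkpoints must share the same architecture (same parameter names).
    keys = checkpoints[0].keys()
    for ckpt in checkpoints:
        if ckpt.keys() != keys:
            raise ValueError("checkpoints do not share the same parameters")

    merged: StateDict = {}
    for name in keys:
        size = len(checkpoints[0][name])
        # Weighted sum of the j-th entry of this parameter across checkpoints.
        merged[name] = [
            sum(w * ckpt[name][j] for w, ckpt in zip(weights, checkpoints))
            for j in range(size)
        ]
    return merged


# Toy "checkpoints" (assumed values, for illustration only).
style_a = {"unet.w": [1.0, 0.0], "unet.b": [0.5]}
style_b = {"unet.w": [0.0, 2.0], "unet.b": [-0.5]}

merged = merge_linear([style_a, style_b], [0.5, 0.5])
print(merged)
```

The weights are fixed once at merge time, so the merged model carries a single blended style. This is the limitation the abstract points to when the source models cover very different styles.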
