USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning - Takara TLDR

Existing literature typically treats style-driven and subject-driven
generation as two disjoint tasks: the former prioritizes stylistic similarity,
whereas the latter insists on subject consistency, resulting in an apparent
antagonism. We argue that both objectives can be unified under a single
framework because they ultimately concern the disentanglement and
re-composition of content and style, a long-standing theme in style-driven
research. To this end, we present USO, a Unified Style-Subject Optimized
customization model. First, we construct a large-scale triplet dataset
consisting of content images, style images, and their corresponding stylized
content images. Second, we introduce a disentangled learning scheme that
simultaneously aligns style features and disentangles content from style
through two complementary objectives, style-alignment training and
content-style disentanglement training. Third, we incorporate a style
reward-learning paradigm (SRL) to further enhance the model’s
performance. Finally, we release USO-Bench, the first benchmark that jointly
evaluates style similarity and subject fidelity across multiple metrics.
Extensive experiments demonstrate that USO achieves state-of-the-art
performance among open-source models along both dimensions of subject
consistency and style similarity. Code and model:
https://github.com/bytedance/USO
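
To make the triplet idea concrete, here is a minimal Python sketch of how one dataset record might be represented and how each of the two training objectives could draw its inputs from it. This is an illustration under stated assumptions, not the released USO code: the `Triplet` class, both helper functions, the file paths, and the choice of which images condition each stage are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Triplet:
    """One dataset record (hypothetical layout; the fields hold image paths)."""
    content_image: str   # reference that carries the subject/content
    style_image: str     # reference that carries the style
    stylized_image: str  # the content rendered in that style


Example = Dict[str, Union[List[str], str]]


def style_alignment_example(t: Triplet) -> Example:
    # Assumption: this stage conditions on the style reference only.
    return {"conditions": [t.style_image], "target": t.stylized_image}


def disentanglement_example(t: Triplet) -> Example:
    # Assumption: this stage conditions on both references, so the model has
    # to take content from one image and style from the other.
    return {
        "conditions": [t.content_image, t.style_image],
        "target": t.stylized_image,
    }


# Placeholder paths, for illustration only.
triplet = Triplet("cat.png", "watercolor.png", "cat_watercolor.png")
alignment = style_alignment_example(triplet)
disentangled = disentanglement_example(triplet)
```

In this sketch both objectives share the same target, the stylized content image, and differ only in which references are supplied as conditions. That is one plausible reading of "complementary objectives"; the actual training recipe may differ.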
