Paper Page - FlowDirector: Training-Free Flow Steering For Precise Text-to-Video Editing

FlowDirector, an inversion-free video editing framework, uses an ODE for spatiotemporally coherent editing and attention-guided masking for localized control, achieving state-of-the-art performance.

Text-driven video editing aims to modify video content according to natural
language instructions. While recent training-free approaches have made progress
by leveraging pre-trained diffusion models, they typically rely on
inversion-based techniques that map input videos into the latent space, which
often leads to temporal inconsistencies and degraded structural fidelity. To
address this, we propose FlowDirector, a novel inversion-free video editing
framework. Our framework models the editing process as a direct evolution in
data space, guiding the video via an Ordinary Differential Equation (ODE) to
smoothly transition along its inherent spatiotemporal manifold, thereby
preserving temporal coherence and structural details. To achieve localized and
controllable edits, we introduce an attention-guided masking mechanism that
modulates the ODE velocity field, preserving non-target regions both spatially
and temporally. Furthermore, to address incomplete edits and enhance semantic
alignment with editing instructions, we present a guidance-enhanced editing
strategy inspired by Classifier-Free Guidance, which leverages differential
signals between multiple candidate flows to steer the editing trajectory toward
stronger semantic alignment without compromising structural consistency.
Extensive experiments across benchmarks demonstrate that FlowDirector achieves
state-of-the-art performance in instruction adherence, temporal consistency,
and background preservation, establishing a new paradigm for efficient and
coherent video editing without inversion.
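
As a rough illustration of the two mechanisms described in the abstract, the following Python sketch integrates a mask-modulated velocity field with plain Euler steps and combines two candidate velocities with a Classifier-Free-Guidance-style extrapolation. It is a toy under stated assumptions, not the paper's implementation: the names `masked_euler_edit` and `guided_velocity`, the Euler solver, the step count, the exact guidance formula, and the stand-in velocity function are all hypothetical. In the actual method the velocity would presumably come from a pre-trained video model and the mask from its attention maps.

```python
import numpy as np


def masked_euler_edit(video, velocity_fn, mask, num_steps=10):
    """Integrate dx/dt = mask * velocity_fn(x, t) from t=0 to t=1.

    Assumption: a fixed-step Euler solver; the abstract does not name one.
    `mask` holds values in [0, 1] and has the same shape as `video`.
    Where mask == 0 the velocity is zeroed, so those values never change.
    """
    x = np.asarray(video, dtype=np.float64).copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * mask * velocity_fn(x, t)
    return x


def guided_velocity(v_base, v_strong, scale):
    """CFG-style extrapolation between two candidate velocities.

    Assumption: this mirrors the usual Classifier-Free Guidance form,
    base + scale * (strong - base); the paper's exact rule may differ.
    """
    return v_base + scale * (v_strong - v_base)


# Toy data: 2 frames of 4x4 "video", all zeros; the edit target is all ones.
source = np.zeros((2, 4, 4))
target = np.ones_like(source)

# Hypothetical mask: edit only the left half of every frame.
mask = np.zeros_like(source)
mask[..., :2] = 1.0


def toy_velocity(x, t):
    # Stand-in for a model call: constant straight-line velocity toward target.
    return target - source


edited = masked_euler_edit(source, toy_velocity, mask, num_steps=10)
```

In this toy setup the masked-out right half of each frame is left exactly as it was, while the left half moves to the target, which is the behavior the abstract attributes to modulating the ODE velocity field. Passing a `scale` above 1 to `guided_velocity` pushes the result beyond `v_strong` along the difference direction, which is the sense in which a differential signal between candidate flows can strengthen an edit.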
