Paper Page - LumosFlow: Motion-Guided Long Video Generation

LumosFlow uses LMTV-DM for key frame generation and LOF-DM followed by MotionControlNet for smooth intermediate frame interpolation, ensuring temporally coherent long video generation.

Long video generation has gained increasing attention due to its widespread
applications in fields such as entertainment and simulation. Despite advances,
synthesizing temporally coherent and visually compelling long sequences remains
a formidable challenge. Conventional approaches often synthesize long videos by
sequentially generating and concatenating short clips, or generating key frames
and then interpolate the intermediate frames in a hierarchical manner. However,
both of them still remain significant challenges, leading to issues such as
temporal repetition or unnatural transitions. In this paper, we revisit the
hierarchical long video generation pipeline and introduce LumosFlow, a
framework introduce motion guidance explicitly. Specifically, we first employ
the Large Motion Text-to-Video Diffusion Model (LMTV-DM) to generate key frames
with larger motion intervals, thereby ensuring content diversity in the
generated long videos. Given the complexity of interpolating contextual
transitions between key frames, we further decompose the intermediate frame
interpolation into motion generation and post-hoc refinement. For each pair of
key frames, the Latent Optical Flow Diffusion Model (LOF-DM) synthesizes
complex and large-motion optical flows, while MotionControlNet subsequently
refines the warped results to enhance quality and guide intermediate frame
generation. Compared with traditional video frame interpolation, we achieve 15x
interpolation, ensuring reasonable and continuous motion between adjacent
frames. Experiments show that our method can generate long videos with
consistent motion and appearance. Code and models will be made publicly
available upon acceptance. Our project page:
https://jiahaochen1.github.io/LumosFlow/

Source link

What's Hot

Amazon Music’s new AI feature generates personalized playlists every Monday

AI Made Her a Better Mom. so She Vibe-Coded a Web App for Others.

Sam Altman says that bots are making social media feel ‘fake’

Paper page – LumosFlow: Motion-Guided Long Video Generation

Why Language Models Hallucinate – Takara TLDR

WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning – Takara TLDR

LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation – Takara TLDR

Storied Collector and MoMA Trustee Dies at 92

Congress Obtains Drawing Trump Apparently Made for Jeffrey Epstein

New Banksy Work at London’s Royal Courts Immediately Covered Up

John Pritzker Donates 188 Dada and Surrealist Works to the Met Museum

Amazon Music’s new AI feature generates personalized playlists every Monday

AI Made Her a Better Mom. so She Vibe-Coded a Web App for Others.

Sam Altman says that bots are making social media feel ‘fake’

What's Hot

Paper page – LumosFlow: Motion-Guided Long Video Generation

Related Posts

Subscribe to Updates