Paper page - A Survey of Interactive Generative Video

Interactive Generative Video (IGV) has emerged as a crucial technology in
response to the growing demand for high-quality, interactive video content
across various domains. In this paper, we define IGV as a technology that
combines generative capabilities to produce diverse high-quality video content
with interactive features that enable user engagement through control signals
and responsive feedback. We survey the current landscape of IGV applications,
focusing on three major domains: 1) gaming, where IGV enables infinite
exploration in virtual worlds; 2) embodied AI, where IGV serves as a
physics-aware environment synthesizer for training agents in multimodal
interaction with dynamically evolving scenes; and 3) autonomous driving, where
IGV provides closed-loop simulation capabilities for safety-critical testing
and validation. To guide future development, we propose a comprehensive
framework that decomposes an ideal IGV system into five essential modules:
Generation, Control, Memory, Dynamics, and Intelligence. Furthermore, we
systematically analyze the technical challenges and future directions in
realizing each module of such a system, including achieving real-time
generation, enabling open-domain control, maintaining long-term coherence,
simulating accurate physics, and integrating causal reasoning. We believe that
this systematic analysis will facilitate future research and development in the
field of IGV, ultimately advancing the technology toward more sophisticated and
practical applications.
