The increasing demand for AR/VR applications has highlighted the need for
high-quality 360-degree panoramic content. However, generating such images and
videos remains challenging due to the severe distortions introduced by
equirectangular projection (ERP). Existing
approaches either fine-tune pretrained diffusion models on limited ERP datasets
or attempt tuning-free methods that still rely on ERP latent representations,
leading to discontinuities near the poles. In this paper, we introduce
SphereDiff, a novel approach for seamless 360-degree panoramic image and video
generation using state-of-the-art diffusion models without additional tuning.
We define a spherical latent representation that ensures uniform distribution
across all perspectives, mitigating the distortions inherent in ERP. We extend
MultiDiffusion to spherical latent space and propose a spherical latent
sampling method to enable direct use of pretrained diffusion models. Moreover,
we introduce distortion-aware weighted averaging to further improve the
generation quality in the projection process. Our method outperforms existing
approaches in generating 360-degree panoramic content while maintaining high
fidelity, making it a robust solution for immersive AR/VR applications. The
code is available at https://github.com/pmh9960/SphereDiff.
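
To make the distortion argument concrete, the sketch below contrasts a near-uniform spherical latent grid with an ERP pixel grid. It is a minimal illustration, not the paper's implementation: the function names (`fibonacci_sphere`, `erp_grid`), the use of a Fibonacci lattice as the uniform spherical sampling, and the 64x128 grid size are assumptions chosen for demonstration; SphereDiff's actual spherical latent sampling and projection may differ.

```python
import numpy as np

def fibonacci_sphere(n_points: int) -> np.ndarray:
    """Near-uniform points on the unit sphere via a Fibonacci lattice.

    Returns an (n_points, 3) array of unit vectors. Illustration of a
    uniformly distributed spherical latent grid (assumed, not the paper's
    exact scheme).
    """
    i = np.arange(n_points)
    golden = (1.0 + 5.0 ** 0.5) / 2.0
    z = 1.0 - (2.0 * i + 1.0) / n_points          # uniform in z => uniform in area
    phi = 2.0 * np.pi * i / golden                # golden-angle longitude spacing
    r = np.sqrt(np.clip(1.0 - z * z, 0.0, None))
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)

def erp_grid(height: int, width: int) -> np.ndarray:
    """Unit vectors for an equirectangular (ERP) pixel grid.

    Rows near the poles map to tiny spherical caps, so ERP latents
    oversample the poles; a uniform spherical latent avoids this.
    """
    lat = (0.5 - (np.arange(height) + 0.5) / height) * np.pi       # +pi/2 .. -pi/2
    lon = ((np.arange(width) + 0.5) / width - 0.5) * 2.0 * np.pi   # -pi .. +pi
    lon, lat = np.meshgrid(lon, lat)
    x = np.cos(lat) * np.cos(lon)
    y = np.cos(lat) * np.sin(lon)
    z = np.sin(lat)
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

if __name__ == "__main__":
    sphere_pts = fibonacci_sphere(64 * 128)
    erp_pts = erp_grid(64, 128)
    # Fraction of latent points within 10 degrees of a pole (|z| > cos(10 deg)).
    cap = np.cos(np.deg2rad(10.0))
    for name, pts in [("spherical latent", sphere_pts), ("ERP latent", erp_pts)]:
        frac = np.mean(np.abs(pts[:, 2]) > cap)
        print(f"{name}: {frac:.3%} of points in the polar caps "
              f"(uniform-area ideal = {100 * (1 - cap):.3f}%)")
```

Running this shows the ERP grid placing roughly an order of magnitude more latent points in the polar caps than their share of spherical area, whereas the uniform spherical grid stays close to the ideal, which is the imbalance the spherical latent representation is meant to remove.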