Paper Page - iLRM: An Iterative Large 3D Reconstruction Model

iLRM, an iterative Large 3D Reconstruction Model, improves scalability and efficiency in 3D reconstruction by decoupling the scene representation from input-view images, using a two-stage attention scheme, and injecting high-resolution information at every layer.

Feed-forward 3D modeling has emerged as a promising approach for rapid and
high-quality 3D reconstruction. In particular, directly generating explicit 3D
representations, such as 3D Gaussian splatting, has attracted significant
attention due to its fast and high-quality rendering, as well as numerous
applications. However, many state-of-the-art methods, primarily based on
transformer architectures, suffer from severe scalability issues because they
rely on full attention across image tokens from multiple input views, resulting
in prohibitive computational costs as the number of views or image resolution
increases. Toward scalable and efficient feed-forward 3D reconstruction, we
introduce an iterative Large 3D Reconstruction Model (iLRM) that generates 3D
Gaussian representations through an iterative refinement mechanism, guided by
three core principles: (1) decoupling the scene representation from input-view
images to enable compact 3D representations; (2) decomposing fully-attentional
multi-view interactions into a two-stage attention scheme to reduce
computational costs; and (3) injecting high-resolution information at every
layer to achieve high-fidelity reconstruction. Experimental results on widely
used datasets, such as RE10K and DL3DV, demonstrate that iLRM outperforms
existing methods in both reconstruction quality and speed. Notably, iLRM
exhibits superior scalability, delivering significantly higher reconstruction
quality under comparable computational cost by efficiently leveraging a larger
number of input views.
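
The scalability problem described above can be made concrete with a little arithmetic. The sketch below counts query-key pairs for full attention over all image tokens from all views, which grows quadratically in both the number of views and the tokens per view. For contrast it also counts pairs when attention is restricted to tokens within each view. That restricted variant is only an illustrative assumption: the abstract does not specify what the two stages of iLRM's attention scheme are, so this is not the paper's method. The function names and the token count of 1024 per view are hypothetical.

```python
def full_attention_pairs(num_views: int, tokens_per_view: int) -> int:
    """Query-key pairs when every image token attends to every other token."""
    total_tokens = num_views * tokens_per_view
    return total_tokens * total_tokens


def per_view_attention_pairs(num_views: int, tokens_per_view: int) -> int:
    """Query-key pairs when tokens attend only within their own view.

    Illustrative assumption only; not the two-stage scheme from the paper.
    """
    return num_views * tokens_per_view * tokens_per_view


if __name__ == "__main__":
    tokens_per_view = 1024  # hypothetical, e.g. a 32 x 32 patch grid
    for num_views in (2, 4, 8, 16):
        full = full_attention_pairs(num_views, tokens_per_view)
        local = per_view_attention_pairs(num_views, tokens_per_view)
        # The ratio full / local equals the number of views.
        print(f"views={num_views:2d}  full={full:>12,d}  "
              f"per-view={local:>12,d}  ratio={full // local}")
```

Under these assumptions, doubling the number of views quadruples the cost of full attention but only doubles the per-view cost, which is the kind of gap that motivates decomposing multi-view attention.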
