Browsing: Hugging Face
FlexPainter, a novel texture generation pipeline, uses a shared conditional embedding space to enable flexible multi-modal guidance, ensuring high-quality and…
A novel 4D representation, FreeTimeGS, enhances the modeling of dynamic 3D scenes by enabling Gaussian primitives to appear at arbitrary…
🥳 Happy to share our new work – Kinetics: Rethinking Test-Time Scaling Laws 🤔 How to effectively build a powerful reasoning agent? Existing…
FEAT, a full-dimensional efficient attention Transformer, addresses challenges in synthesizing high-quality dynamic medical videos by improving channel interactions, reducing computational…
PATS is a novel temporal sampling method that enhances video analysis of athletic skills by ensuring complete movement patterns are…
Diffusion models improve 3D occupancy prediction from visual inputs, enhancing accuracy and robustness in complex and occluded scenes, which benefits…
The BEVCALIB model uses bird’s-eye view features for accurate LiDAR-camera calibration from raw data, demonstrating superior performance under various noise conditions.…
Most existing vision encoders map images into a fixed-length sequence of tokens, overlooking the fact that different images contain varying…
Recent advances in slow-thinking language models (e.g., OpenAI-o1 and DeepSeek-R1) have demonstrated remarkable abilities in complex reasoning tasks by emulating…