Browsing: Hugging Face
Natural Language to SQL (NL2SQL) enables intuitive interactions with databases by transforming natural language queries into structured SQL statements. Despite…
We propose a new problem, In-2-4D, for generative 4D (i.e., 3D + motion) inbetweening from a minimalistic input setting: two…
Building general-purpose models that can effectively perceive the world through multimodal signals has been a long-standing goal. Current approaches involve…
Existing approaches for controlling text-to-image diffusion models, while powerful, do not allow for explicit 3D object-centric control, such as precise…
Current monocular 3D detectors are held back by the limited diversity and scale of real-world datasets. While data augmentation certainly…
3D part amodal segmentation (decomposing a 3D shape into complete, semantically meaningful parts, even when occluded) is a challenging but crucial task…
Mixture-of-Experts (MoE) Large Language Models (LLMs) suffer from severely sub-optimal expert pathways: our study reveals that naive expert selection learned from…
Recent progress in diffusion models significantly advances various image generation tasks. However, the current mainstream approach remains focused on building…
In this paper, we present an effective method to enhance visual reasoning with significantly fewer training samples, relying purely on…
We present a novel, open-source social network simulation framework, MOSAIC, where generative language agents predict user behaviors such as liking,…