Browsing: Hugging Face
Agent systems powered by large language models (LLMs) have demonstrated impressive performance on repository-level code-generation tasks. However, for tasks such…
Image captioning is a fundamental task that bridges the visual and linguistic domains, playing a critical role in pre-training Large…
The growing capabilities of large language models and multimodal systems have spurred interest in voice-first AI assistants, yet existing benchmarks…
Accurate classification of products under the Harmonized Tariff Schedule (HTS) is a critical bottleneck in global trade, yet it has…
Protein folding models have achieved groundbreaking results typically via a combination of integrating domain knowledge into the architectural blocks and…
Large Language Models (LLMs) face significant computational challenges when processing long contexts due to the quadratic complexity of self-attention. While…
Despite steady progress in layout-to-image generation, current methods still struggle with layouts containing significant overlap between bounding boxes. We identify…
Recent advances in behavior cloning (BC) have enabled impressive visuomotor control policies. However, these approaches are limited by the quality…
We propose a framework that enables neural models to “think while listening” to everyday sounds, thereby enhancing audio classification performance.…
Reinforcement learning (RL) has shown promise in training agentic models that move beyond static benchmarks to engage in dynamic, multi-turn…