Browsing: Hugging Face
Recent advances in reinforcement learning (RL) have significantly enhanced the agentic capabilities of large language models (LLMs). In long-term and…
Recent advances in 3D-native generative models have accelerated asset creation for games, film, and design. However, most methods still rely…
Large multimodal reasoning models have achieved rapid progress, but their advancement is constrained by two major limitations: the absence of…
Image composition aims to seamlessly insert a user-specified object into a new scene, but existing models struggle with complex lighting…
Traditional recommender systems rely on passive feedback mechanisms that limit users to simple choices such as like and dislike. However,…
We present SD3.5-Flash, an efficient few-step distillation framework that brings high-quality image generation to accessible consumer devices. Our approach distills…
We present a scientific reasoning foundation model that aligns natural language with heterogeneous scientific representations. The model is pretrained on…
Robotic manipulation policies often fail to generalize because they must simultaneously learn where to attend, what actions to take, and…
Visual Spatial Reasoning (VSR) is a core human cognitive ability and a critical requirement for advancing embodied intelligence and autonomous…
The use of continuous instead of discrete tokens during the Chain-of-Thought (CoT) phase of reasoning LLMs has garnered attention recently,…