Browsing: Hugging Face
We introduce Robix, a unified model that integrates robot reasoning, task planning, and natural language interaction within a single vision-language…
Large language models (LLMs) are increasingly expected to go beyond simple factual queries toward Deep Research-tasks that require decomposing questions…
Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) have empowered large language models (LLMs) to tackle challenging reasoning tasks…
Medical image retrieval is essential for clinical decision-making and translational research, relying on discriminative visual representations. Yet, current methods remain…
Traditional alignment methods for Large Vision and Language Models (LVLMs) primarily rely on human-curated preference data. Human-generated preference data is…
Point cloud learning, especially in a self-supervised way without manual labels, has gained growing attention in both vision and learning…
Visual autoregressive models (VAR) have recently emerged as a promising class of generative models, achieving performance comparable to diffusion models…
Accurate segmentation of organs and tumors in CT and MRI scans is essential for diagnosis, treatment planning, and disease monitoring.…
GUI agents powered by LLMs show promise in interacting with diverse digital environments. Among these, video games offer a valuable…
Large Language Models (LLMs) excel at generating synthetic data, but ensuring its quality and diversity remains challenging. We propose Genetic…