Browsing: Hugging Face
Despite the evolution of Multimodal Large Language Models (MLLMs), a non-negligible limitation remains in their struggle with visual text…
We find that the response length of reasoning LLMs, whether trained by reinforcement learning or supervised learning, drastically increases for…
We release OLMoTrace, a tool that lets you trace the outputs of language models back to their full, multi-trillion-token training…
Reasoning has emerged as the next major frontier for language models (LMs), with rapid advances from both academic and industrial…
Creating a realistic animatable avatar from a single static portrait remains challenging. Existing approaches often struggle to capture subtle facial…
Excited to present KUMO, a generative evaluation benchmark for LLMs. Unlike static benchmarks, KUMO dynamically generates diverse, multi-turn reasoning tasks…
Balancing fidelity and editability is essential in text-based image editing (TIE), where failures commonly lead to over- or under-editing issues.…
Existing reasoning evaluation frameworks for Large Language Models (LLMs) and Large Vision-Language Models (LVLMs) predominantly either assess text-based reasoning or…
Diffusion models approximate the denoising distribution as a Gaussian and predict its mean, whereas flow matching models reparameterize the Gaussian…
The proliferation of Large Language Models (LLMs) accessed via black-box APIs introduces a significant trust challenge: users pay for services…