Browsing: Hugging Face
Prolonged reinforcement learning training (ProRL) uncovers novel reasoning strategies in language models, outperforming base models and suggesting meaningful expansion of…
See examples and results at: https://leililab.github.io/HardTests/ RLVR is not just about RL, it’s more about VR! Particularly for LLM coding,…
CAPTCHAs have been a critical bottleneck for deploying web agents in real-world applications, often blocking them from completing end-to-end automation…
Vision language models exhibit strong biases in counting and identification tasks, demonstrating a failure mode that persists even with additional…
A study reveals that Large Language Models (LLMs) struggle with expressing uncertainty accurately and introduces MetaFaith, a prompt-based method that…
We present v1, a lightweight extension to Multimodal Large Language Models (MLLMs) that enables selective visual revisitation during inference. While…
A comprehensive TTS benchmark, EmergentTTS-Eval, automates test-case generation and evaluation using LLMs and LALM to assess nuanced and semantically complex…
LLMs are nonlinear functions that map a sequence of input embedding vectors to a predicted embedding vector. We show that…
CLIPGaussians is a style transfer framework that supports text- and image-guided stylization of 2D images, videos, 3D objects, and 4D…
Evaluation of Large Reasoning Models in multilingual reasoning shows limited capability, with interventions improving readability but reducing accuracy. Recent Large…