Browsing: Hugging Face
Vision-language models (VLMs) are more vulnerable to harmful meme-based prompts than to synthetic images, and while multi-turn interactions offer some protection, significant…
A framework called QwenLong-L1 enhances large reasoning models for long-context reasoning through reinforcement learning, achieving leading performance on document question-answering…
The Transformer Copilot framework enhances large language model performance through a Copilot model that refines the Pilot’s logits based on…
Policy gradient algorithms have been successfully applied to enhance the reasoning capabilities of large language models (LLMs). Despite the widespread…
Orthogonal Residual Updates enhance feature learning and training stability by decomposing module outputs to contribute primarily novel features. Residual connections…
Synthetic Data RL enhances foundation models through reinforcement learning using only synthetic data, achieving performance comparable to models trained with…
Temporal reasoning is pivotal for Large Language Models (LLMs) to comprehend the real world. However, existing works neglect the real-world…
RIPT-VLA is a reinforcement learning-based interactive post-training paradigm that enhances pretrained Vision-Language-Action models using sparse binary success rewards, improving adaptability…
A scalable 3D shape generation framework using sparse volumes and spatial sparse attention, enabling high-resolution generation with reduced computational requirements. …
Did you know that fine-tuning retrievers and re-rankers on large but noisy training datasets can harm their performance? 😡 In…