Browsing: Hugging Face
Reinforcement learning shows promise for enhancing the reasoning abilities of large language models, yet it is hard to scale for…
Prioritizing feature consistency in sparse autoencoders improves mechanistic interpretability of neural networks by ensuring reliable and interpretable features. Sparse Autoencoders…
Multi-Turn Decomposition improves efficiency in large reasoning models by breaking down chain-of-thought into manageable turns, reducing token usage and latency…
This study presents a comprehensive analysis of two emerging paradigms in AI-assisted software development: vibe coding and agentic coding. While…
DoctorAgent-RL, a reinforcement learning-based multi-agent framework, enhances multi-turn reasoning and diagnostic performance in medical consultations compared to existing systems. Large…
A reinforcement learning-guided training paradigm enhances large language models’ reasoning efficiency and performance for multi-hop questions by interleaving thinking and…
PathFinder-PRM, a hierarchical and error-aware Process Reward Model, improves mathematical problem-solving through fine-grained error classification and step correctness estimation, achieving…
A survey proposes a systematic taxonomy for evaluating large audio-language models across dimensions including auditory awareness, knowledge reasoning, dialogue ability,…
Foundation models are becoming increasingly capable autonomous programmers, raising the prospect that they could also automate dangerous offensive cyber-operations. Current…
DC-CoT provides a comprehensive benchmark for assessing data-centric techniques in chain-of-thought distillation, focusing on performance and generalization across different…