Browsing: Hugging Face
A new multilingual, multi-sector, and multi-task benchmark, M³FinMeeting, evaluates large language models’ performance in understanding financial meetings across different languages…
Adaptive parallel decoding (APD) enhances the throughput of diffusion large language models (dLLMs) by dynamically adjusting parallel token generation without…
A unified large recommender model with intrinsic reasoning capabilities is proposed, facilitating interleaved reasoning and recommendation using a reinforcement learning…
Task-oriented dialogue systems often face difficulties when user utterances seem semantically complete but lack necessary structural information for appropriate system…
Vision language models (VLMs) are expected to perform effective multimodal reasoning and make logically coherent decisions, which is critical to…
VAU-R1 uses Multimodal Large Language Models with Reinforcement Fine-Tuning to enhance video anomaly reasoning, complemented by VAU-Bench, a Chain-of-Thought benchmark…
Recently, the powerful text-to-image capabilities of ChatGPT-4o have led to growing appreciation for native multimodal large language models. However,…
SATA-BENCH evaluates LLMs on multi-answer questions, revealing selection biases and proposing Choice Funnel to improve accuracy and reduce costs in…
Existing large language models (LLMs) face challenges in following complex instructions, especially when multiple constraints are present and organized in…
Recent advances in text-to-video diffusion models have enabled high-quality video synthesis, but controllable generation remains challenging, particularly under limited data…