Browsing: Hugging Face
As large language models (LLMs) advance in conversational and reasoning capabilities, their practical application in healthcare has become a critical…
Fine-grained object detection in difficult visual domains, such as vehicle damage assessment, presents a formidable challenge even for human experts…
Critic-free reinforcement learning methods, particularly group policies, have attracted considerable attention for their efficiency in complex tasks. However, these methods…
Recent advances in reasoning and planning capabilities of large language models (LLMs) have enabled their potential as autonomous agents capable…
Surface defect detection is a critical task across numerous industries, aimed at efficiently identifying and localising imperfections or irregularities on…
The increasing adoption of large language models (LLMs) in software engineering necessitates rigorous security evaluation of their generated code. However,…
The human ability to seamlessly perform multimodal reasoning and physical interaction in the open world is a core goal for…
Multimodal Large Language Models (MLLMs) equipped with step-by-step thinking capabilities have demonstrated remarkable performance on complex reasoning problems. However, this…
Audio-driven talking head synthesis has achieved remarkable photorealism, yet state-of-the-art (SOTA) models exhibit a critical failure: they lack generalization to…
GUI agents aim to automate operations on mobile and PC devices, an important step toward achieving artificial general intelligence.…