Browsing: Hugging Face
Reinforcement learning with verifiable rewards (RLVR) has shown promise in enhancing the reasoning capabilities of large language models by learning…
Action customization involves generating videos where the subject performs actions dictated by input control signals. Current methods use pose-guided or…
Large Language Models (LLMs) have demonstrated unprecedented capabilities across various natural language processing tasks. Their ability to process and generate…
Recent advancements in AI-driven soccer understanding have demonstrated rapid progress, yet existing research predominantly focuses on isolated or narrow tasks.…
✨ Highlights: Low latency. VITA-Audio is the first end-to-end speech model capable of generating audio during the initial forward pass.…
The rapid advancement of diffusion models holds the promise of revolutionizing the application of VR and AR technologies, which typically…
In recent years, multi-agent frameworks powered by large language models (LLMs) have advanced rapidly. Despite this progress, there is still…
The collaborative paradigm of large and small language models (LMs) effectively balances performance and cost, yet its pivotal challenge lies…
Traditional data presentations typically place the presenter and the visualization in two separate spaces (the 3D world and a 2D screen), enforcing visualization-centric…
Transformers have achieved great success in numerous NLP tasks but continue to exhibit notable gaps in multi-step factual reasoning, especially…