Browsing: Hugging Face
This paper introduces ByteWrist, a novel, highly flexible and anthropomorphic parallel wrist for robotic manipulation. ByteWrist addresses the critical limitations of…
Universal multimodal embedding models have achieved great success in capturing semantic relevance between queries and candidates. However, current methods either…
Despite the growing interest in replicating the scaled success of large language models (LLMs) in industrial search and recommender systems,…
Creating high-fidelity 3D models of indoor environments is essential for applications in design, virtual reality, and robotics. However, manual 3D…
When evaluating large language models (LLMs) with multiple-choice question answering (MCQA), it is common to end the prompt with the…
The ultimate goal of embodied agents is to create collaborators that can interact with humans, not mere executors that passively…
Although COLMAP has long remained the predominant method for camera parameter optimization in static scenes, it is constrained by its…
In the field of AI-driven human-GUI interaction automation, while rapid advances in multimodal large language models and reinforcement fine-tuning techniques…
Generative modeling, representation learning, and classification are three core problems in machine learning (ML), yet their state-of-the-art (SoTA) solutions remain…
Robotic real-world reinforcement learning (RL) with vision-language-action (VLA) models is bottlenecked by sparse, handcrafted rewards and inefficient exploration. We introduce…