Built on the cloud-based Gemini Robotics model, the new VLA model shows strong general-purpose dexterity and task generalisation, the company said. Since Gemini Robotics On-Device can execute tasks without a data network, it can be useful for activities that need quick responses to commands, or in environments with intermittent or no internet connectivity.
The model is meant for bi-arm robots and is engineered to run with minimal computational resources. Google showed a robot following voice commands to unzip a bag, open a container, and uncap a marker while running the model locally.
While the VLA model was trained only on ALOHA robots, Google said it was able to further adapt it to a bi-arm Franka FR3 robot and Apptronik's Apollo humanoid robot. The Franka robot was able to perform tasks such as folding clothes and changing industrial belts while running the model.
On the Apollo humanoid robot, Google said the adapted model could follow natural language instructions and manipulate a variety of objects, including previously unseen ones, in a generalisable way.
Google is sharing the Gemini Robotics software development kit (SDK) through its trusted tester programme to help developers easily evaluate Gemini Robotics On-Device on their own tasks and environments. They can test the model in Google’s MuJoCo physics simulator and adapt it to new domains with as few as 50 to 100 demonstrations, the tech major said.
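The SDK's own interface is available only to trusted testers, but MuJoCo is open source and its Python API is public. The sketch below is purely illustrative: it loads a toy two-joint arm in MuJoCo and steps the physics with a placeholder policy standing in for the model's action output. The model XML, the placeholder_policy function, and the observation layout are assumptions for illustration, not part of Google's SDK.

```python
# Illustrative only: a toy MuJoCo loop with a stand-in policy. The Gemini
# Robotics SDK's real interface is available only through the trusted tester
# programme and is not reproduced here.
import numpy as np
import mujoco

# A minimal two-joint arm; a real evaluation would load a bi-arm robot scene.
ARM_XML = """
<mujoco>
  <worldbody>
    <body>
      <joint name="shoulder" type="hinge" axis="0 1 0"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0 0 0.3"/>
      <body pos="0 0 0.3">
        <joint name="elbow" type="hinge" axis="0 1 0"/>
        <geom type="capsule" size="0.02" fromto="0 0 0 0 0 0.3"/>
      </body>
    </body>
  </worldbody>
  <actuator>
    <motor joint="shoulder"/>
    <motor joint="elbow"/>
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(ARM_XML)
data = mujoco.MjData(model)


def placeholder_policy(observation: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the VLA model's action output.

    Returns small random torques; in practice the on-device model would map
    camera images and a language instruction to actuator commands.
    """
    return 0.01 * np.random.randn(model.nu)


for _ in range(1000):
    # Observation here is just joint positions and velocities.
    obs = np.concatenate([data.qpos, data.qvel])
    data.ctrl[:] = placeholder_policy(obs)  # write actuator commands
    mujoco.mj_step(model, data)             # advance the physics one step
```

Adapting the model itself, per the announcement, needs only 50 to 100 demonstrations, but that fine-tuning path runs through the SDK rather than through raw MuJoCo calls like these.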