Google's new Gemini Robotics On-Device AI model runs directly on robotic devices
Google DeepMind unveiled a new vision-language-action (VLA) model called Gemini Robotics On-Device on Tuesday, June 24. The model is designed to run locally on robotic devices rather than in the cloud. According to Google, it has been optimized to run efficiently on the robot itself while showing strong general-purpose dexterity and task generalization.
Offline Functionality and Features
This new model builds on Google's Gemini Robotics model, which was introduced in March. It controls a robot's movements and understands natural-language prompts, much as ChatGPT does with text. Notably, it works without an active internet connection, which makes it particularly useful for latency-sensitive applications and for environments with limited connectivity.

Enhanced Capabilities
Designed for two-armed robots, the Gemini Robotics On-Device model is engineered to run with minimal computational resources. It can perform highly dexterous tasks such as folding clothes and unzipping bags. Google says the model follows complex multi-step instructions and handles challenging out-of-distribution tasks better than other on-device alternatives.
Software Development Kit (SDK) Availability
Developers who want to try Gemini Robotics On-Device can access the model through a software development kit (SDK) provided by Google, which lets them experiment with the model in their own robotics applications.

Google is not alone in building AI models for robots. Companies such as NVIDIA and Hugging Face are also developing models for humanoid robots. NVIDIA recently introduced GR00T N1, an AI model tailored for humanoid robots, while Hugging Face is working on launching its own robot powered by an in-house open-source model.
