Google (GOOGL) Unveils Gemini Robotics On-Device AI Model for...
Google (GOOGL) has launched Gemini Robotics On-Device, a vision-language-action (VLA) AI model designed to run locally on robotic hardware without an internet connection. The model extends the previously released cloud-based Gemini Robotics model and is aimed at improving robot control and responsiveness.
Enhanced Human-Machine Interaction
Developers can direct the Gemini Robotics On-Device model with natural language prompts and fine-tune it for their own applications, making human-machine interaction more efficient across a wide range of uses. In Google's internal evaluations, the on-device model performed comparably to its cloud-based counterpart and outperformed other on-device AI models on several standard benchmarks.

Demonstrated Capabilities
Live demonstrations showed robots running the Gemini Robotics On-Device model completing tasks such as unzipping a backpack and folding clothes. Originally developed for ALOHA robots, the model has since been adapted to other robotic platforms, including the dual-arm Franka FR3 robot and Apptronik's Apollo humanoid robot.

The Franka FR3 robot, in particular, adapted well to unfamiliar environments, handling tasks such as industrial belt assembly.
Supporting Developers
To support developers, Google DeepMind has introduced the Gemini Robotics SDK. The software development kit lets developers test the model in the MuJoCo physics simulator and adapt it to new tasks with as few as 50 to 100 demonstrations, which significantly speeds up training and deployment.

Other prominent tech companies, including NVIDIA (NVDA) and Hugging Face, are also working to combine AI models with robotics. NVIDIA is building a foundation model platform for humanoid robots, while Hugging Face is developing open-source models and datasets for robotics applications.