TalkBack uses Gemini Nano to increase accessibility for Android users - Android Developers Blog
TalkBack is Android’s screen reader in the Android Accessibility Suite that describes text and images for Android users who have blindness or low vision. The TalkBack team is always working to make Android more accessible.
Advancing Accessibility with Gemini Nano
Today, thanks to Gemini Nano with multimodality, TalkBack automatically provides users with blindness or low vision more vivid and detailed image descriptions to better understand the images on their screen. This advancement is a core part of Google’s mission to build for everyone.
TalkBack now uses Gemini Nano’s new multimodal capabilities to automatically provide users with clear, detailed image descriptions in apps including Google Photos and Chrome, even if the device is offline or has an unstable network connection.
Enhanced Image Descriptions with Gemini Nano
Gemini Nano can pick out specifics in an image, such as landmarks and products. For example, instead of a generic description like "Full moon over the ocean," Gemini Nano can provide a richer description like "A panoramic view of Sydney Opera House and the Sydney Harbour Bridge from the north shore of Sydney, New South Wales, Australia."
Using the Gemini Nano on-device model has significantly improved the quality of image descriptions for TalkBack users, especially for the 90 unlabeled images they come across daily.
Hybrid AI Solution for Enhanced User Experience
TalkBack developers have also implemented a hybrid AI solution that uses Gemini 1.5 Flash alongside Gemini Nano to provide the best of on-device and server-based generative AI features. Users who want even more detail about an image can have it run through the server-based Gemini 1.5 Flash model.
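The hybrid idea can be illustrated with a small sketch. This is not TalkBack's implementation and does not call any real Gemini API: `HybridDescriber`, `describe`, `describe_in_detail`, and the `online` flag are hypothetical names, and the two describers are plain callables standing in for an on-device model and a server-based model.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type: a describer takes image bytes and returns a text description.
Describer = Callable[[bytes], str]


@dataclass
class HybridDescriber:
    """Illustrative router between an on-device and a server-based describer."""

    on_device: Describer                 # stand-in for an on-device model
    server: Optional[Describer] = None   # stand-in for a server-based model

    def describe(self, image: bytes) -> str:
        # Default path: always on-device, so it works without a network.
        return self.on_device(image)

    def describe_in_detail(self, image: bytes, online: bool) -> str:
        # On request, try the server model for a richer description.
        if online and self.server is not None:
            try:
                return self.server(image)
            except Exception:
                # Network or server failure: fall back to the on-device result.
                pass
        return self.on_device(image)


if __name__ == "__main__":
    # Stub describers used only for demonstration.
    hybrid = HybridDescriber(
        on_device=lambda img: "A bridge over a harbour",
        server=lambda img: "A steel arch bridge over a harbour at sunset",
    )
    print(hybrid.describe(b"..."))
    print(hybrid.describe_in_detail(b"...", online=True))
```

The design choice in this sketch is that the on-device path is the default and the server path is opt-in, so a description is always available even when the device is offline.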
Recommendations for Developers
The Android accessibility team recommends that developers first prototype and test a Gemini Nano with multimodality feature on a more powerful, server-side model. This approach can help developers understand the UX faster and ensure the highest quality possible.
While Gemini Nano can provide missing context to improve image descriptions, developers are still advised, as a best practice, to provide detailed alt text for all images in their apps or on their websites.
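For websites, one way to act on the alt-text advice is to scan markup for images that have none. The following is a minimal sketch using Python's standard `html.parser` module; the names `MissingAltFinder` and `find_images_missing_alt` are made up for illustration, and the check is deliberately simple (it also flags empty `alt=""`, which can be legitimate for purely decorative images, so treat results as items to review).

```python
from html.parser import HTMLParser
from typing import List


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> whose alt attribute is absent or blank."""

    def __init__(self) -> None:
        super().__init__()
        self.missing: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # Flag images with no alt attribute, or with an empty/whitespace-only one.
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src") or "(no src)")


def find_images_missing_alt(html: str) -> List[str]:
    """Return the src values of images that need alt text reviewed."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    page = '<p><img src="moon.jpg" alt="Full moon over the ocean"><img src="bridge.jpg"></p>'
    print(find_images_missing_alt(page))
```

In an Android app the equivalent property is a view's content description rather than an HTML `alt` attribute; this sketch covers only the website case.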
Conclusion
Leveraging Gemini Nano with multimodality to provide vivid and detailed image descriptions automatically is a significant step towards creating inclusive and accessible features for Android users. The hybrid approach towards AI, combining on-device processing with server-based AI, demonstrates Google's commitment to promoting inclusivity and accessibility.
Learn more about Gemini Nano for app development.