Google Photos introduces new AI-based feature
The Google Research team recently revealed Ask Photos, a Gemini-powered feature being integrated into Google Photos. Ask Photos is a strong example of how Gemini models can act as agents that draw on memory capabilities.
According to the Google Research team, Ask Photos will revolutionize the way users interact with their photos. Sample queries provided by Google include: 'Show me the best photo from each national park I’ve visited' and 'What themes have we had for Lena’s birthday parties'. These conversational queries are passed to an agent model, which uses Gemini to determine the best retrieval-augmented generation (RAG) tool for the task.
Typically, the agent model begins by working out what the user is asking for and then composes a search across the user's photos using an enhanced vector-based retrieval system. This system extends the strong metadata search capabilities already built into Google Photos and is particularly adept at understanding natural-language concepts.
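To make the idea of vector-based retrieval concrete, here is a minimal sketch in Python. It is not Google's implementation: the three-dimensional embeddings, the file names, and the helper names cosine_similarity and top_k are all hypothetical, chosen only to show how a query vector can be ranked against photo vectors by similarity.

```python
import math

# Hypothetical toy embeddings. A real system would obtain these from a
# learned model; the numbers and file names here are made up.
PHOTO_EMBEDDINGS = {
    "yosemite_sunset.jpg": [0.9, 0.1, 0.0],
    "birthday_cake.jpg": [0.0, 0.2, 0.9],
    "zion_canyon.jpg": [0.8, 0.3, 0.1],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding, embeddings, k=2):
    """Rank photos by similarity to the query and return the k best names."""
    scored = [
        (cosine_similarity(query_embedding, vector), name)
        for name, vector in embeddings.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Stand-in for an embedded "national park" query (assumed values).
query = [1.0, 0.2, 0.0]
results = top_k(query, PHOTO_EMBEDDINGS)
print(results)
```

In this toy data the two landscape photos rank above the birthday photo because their vectors point in nearly the same direction as the query, which is the property that lets this kind of search match concepts rather than exact keywords.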
The answer model then considers the retrieved photos and videos, leveraging Gemini’s long context window and multimodal abilities to find the most relevant information. It can extract visual content, text, dates, locations, and more to craft a helpful response grounded in the user's photos and videos.
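The grounding step described above can be sketched roughly as follows. This is an illustrative assumption about how retrieved items might be packed into a model's context, not a description of the Ask Photos internals: the RetrievedItem fields, the build_grounded_prompt helper, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RetrievedItem:
    """Hypothetical record for one photo or video returned by retrieval."""
    filename: str
    date: str
    location: str
    caption: str


def build_grounded_prompt(question, items):
    """Pack retrieved items and the question into one prompt string.

    Each item gets a numbered line so an answer model could cite it.
    """
    lines = ["Answer using only the items below.", ""]
    for index, item in enumerate(items, start=1):
        lines.append(
            f"[{index}] {item.filename} | {item.date} | "
            f"{item.location} | {item.caption}"
        )
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Made-up sample records, for illustration only.
items = [
    RetrievedItem("party_2022.jpg", "2022-05-14", "Home", "Mermaid-themed cake"),
    RetrievedItem("party_2023.jpg", "2023-05-13", "Park", "Unicorn balloons"),
]
prompt = build_grounded_prompt(
    "What themes have we had for Lena's birthday parties?", items
)
print(prompt)
```

Keeping dates, locations, and captions next to each item is one simple way to let a long-context model base its answer on the retrieved material rather than on guesswork.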
Ask Photos will not only help users find information efficiently but will also remember that information for future conversations. Because it carries context from one conversation to the next, it goes beyond a simple search feature.
For more information, read the original article Google I/O 2024 — AI takes centre stage with Gemini, Project Astra, and more.