Google's Most Advanced AI Coming To Search, Cloud And More
Google launched Gemini on Wednesday, introducing the company’s most powerful large language model (LLM). The model is multimodal: it can process text, images, audio, video, and code, which makes it highly versatile.
The company has already begun using Gemini in Google Search, specifically in the Search Generative Experience (SGE), where it has cut latency by 40% for English-language users in the U.S. Gemini will also be extended to other services such as Ads, Chrome, and Duet AI.
Gemini Pro and Its Availability
Starting December 13, developers and enterprise customers will be able to access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI.
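For developers wondering what a call to the Gemini API might look like, here is a minimal sketch in Python that uses only the standard library. It is an illustration under stated assumptions, not Google's official client: the `v1beta` endpoint, the `gemini-pro` model name, and the `GEMINI_API_KEY` environment variable are assumptions that should be checked against the current API documentation, and the helper names `build_request`, `extract_text`, and `generate` are hypothetical.

```python
import json
import os
import urllib.request

# Assumed endpoint and model name; verify both against the current Gemini API docs.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_BASE}/models/{MODEL}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key sent as a header, not in the URL
        },
        method="POST",
    )


def extract_text(payload: dict) -> str:
    """Join the text parts of the first candidate in a response payload."""
    parts = payload["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


def generate(prompt: str, api_key: str) -> str:
    """Send the prompt over the network and return the model's text reply."""
    request = build_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_text(json.load(response))


if __name__ == "__main__":
    # GEMINI_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        print(generate("Summarize what a multimodal model is.", key))
```

Splitting request construction from the network call keeps the payload shape easy to inspect without an API key. Enterprise customers going through Vertex AI would authenticate differently, which this sketch does not cover.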
Google intends to offer licenses for Gemini to customers through Google Cloud, enabling them to incorporate this advanced AI model into their own applications.
Gemini's Performance
Gemini was trained on a new generation of powerful cloud-based processors, which can train large AI models faster than earlier generations of the hardware. This advancement is poised to benefit the AI industry and advertisers by making AI training more efficient and accessible.
Gemini Ultra scored 90.0% on the Massive Multitask Language Understanding (MMLU) benchmark, surpassing human experts. The benchmark covers a wide array of subjects such as math, physics, history, law, medicine, and ethics, so the result reflects the model's breadth of knowledge and problem-solving ability.
Future Developments
Google plans to launch Bard Advanced early next year, offering users access to its top AI models and capabilities, starting with Gemini Ultra. Additionally, Gemini’s advanced multimodal reasoning abilities will help in processing complex written and visual information effectively.
One of the main challenges Google faced in launching Gemini was ensuring that its multimodal technology remains safe and non-offensive, given that the model can process several kinds of input, such as text and images, at the same time.