OpenAI's new ChatGPT-4 Omni sounds more human than ever
OpenAI unveiled GPT-4 Omni (GPT-4o) during its Spring Update on Monday in San Francisco. Chief Technology Officer Mira Murati and OpenAI staff showcased the company's newest flagship model, which holds real-time spoken conversations and replies in a friendly voice that sounds convincingly human.
GPT-4o: A Leap in AI Technology
According to Murati, GPT-4o provides GPT-4-level intelligence but is much faster, shifting the paradigm toward a future where interactions with AI feel more natural and collaborative. The model responds almost instantly to spoken prompts in a voice that sounds remarkably like Scarlett Johansson, who voiced an AI assistant in the movie "Her." The new technology can also identify emotion and tone in a user's speech, making interactions more engaging.
Unlike its predecessor, GPT-4o combines text, vision, and audio processing in a single model, which improves speed and efficiency. That integration lets users present visual and spoken information at the same time, drawing on the model's multilingual, audio, and vision capabilities.
Real-time Conversations and Multimodal Capabilities
OpenAI also plans to release a ChatGPT desktop application for macOS, giving users direct access to voice conversations with ChatGPT and simplified screen sharing. During the demonstration, staff members Mark Chen and Barret Zoph showcased the model's real-time, multimodal capabilities, including storytelling and problem-solving. While GPT-4 Omni occasionally struggled to grasp the presenters' intentions, its ability to adapt and keep the conversation flowing was notable.
The voice model displayed a range of intonations and emotions, adding to its human-like quality. According to an OpenAI staff member, GPT-4o outperformed industry competitors on a variety of benchmarks, reinforcing OpenAI's position as a leader in AI innovation.
A Glimpse into the Future
The release of GPT-4o marks a significant advance in AI chatbot technology, with real-time responsiveness and improved conversational dynamics. With potential applications such as a more capable Siri, GPT-4o stands out for its seamless performance, which may be attributable to Nvidia's latest inference chips.
OpenAI's Monday demo emphasized the model's capabilities, leaving the audience eager to explore its full potential. As the technology landscape evolves, GPT-4o represents a groundbreaking step toward more human-like AI interactions.
A version of this article originally appeared on Gizmodo