The ChatGPT Revolution: OpenAI Introduces Advanced Voice Mode for Video Conversations

OpenAI launches an explosive new product: the ChatGPT voice feature

Nearly seven months after first demonstrating it publicly, OpenAI has released a capability that can understand video in real time. According to reports from Zhito Finance, the company has officially rolled out real-time video conversations for paying users of ChatGPT's Advanced Voice assistant.

During a live stream on Thursday, OpenAI said that Advanced Voice Mode, its human-like conversational feature, now supports video, powered by OpenAI's multimodal model GPT-4o.

OpenAI announced the launch of video and screen sharing in the ChatGPT mobile app. Subscribers to ChatGPT Plus, Team, or Pro can point their phones at objects and receive near real-time responses from ChatGPT.

Advanced Voice Functionality

During the live broadcast, OpenAI researchers demonstrated the new feature. Users tap the voice icon next to the ChatGPT chat bar, then tap the video icon in the lower left corner to start a video conversation. To share their screen, mobile users open the three-dot menu and choose 'Share Screen'.

Through screen sharing, Advanced Voice can understand what is displayed on the device's screen. For instance, it can explain various settings menus or offer suggestions for solving math problems.

Rollout Plan

OpenAI announced that most ChatGPT Plus and Pro subscribers, along with all Team users, will gain access to the new features through the ChatGPT app within the next few days. Additionally, users in the EU, Switzerland, Iceland, Norway, and Liechtenstein are expected to access these features shortly. ChatGPT Enterprise and Edu, the enterprise and education versions, are set to receive the new features in January next year.

Delayed Launch and Future Prospects

Advanced Voice was delayed several times, partly because OpenAI announced the feature before it was ready for deployment. In April, OpenAI promised a release within a few weeks; it later extended the timeline, saying it needed more development time.

To make sure the feature could handle user requests effectively, OpenAI first introduced the voice mode to a small group of Plus plan users at the end of June. The company then announced a one-month delay in the release, saying it wanted to ensure the feature was safe and efficient for a potentially large user base.

Competitors such as Google and Meta are also developing similar features for their chatbot products. Google recently launched Project Astra, a conversational AI feature that analyzes video in real time, for a select group of "trusted testers".