ChatGPT Strikes Emotional Chords: Crying Over Poetry

Published On Fri Aug 02 2024

ChatGPT version "Her" goes viral: Crying while reciting poetry...

Just one day after its launch, GPT-4o’s advanced voice mode has become a huge hit. Countless netizens have run wild with tests, and GPT-4o has not only taken on all kinds of odd tasks but has performed so well that many exclaimed it "blew my mind".

For example, a netizen asked GPT-4o to tell a story in Chinese, and it performed like this: after listening, many netizens who understand Chinese said GPT-4o's delivery was decent in both emotion and overall narration. It is not perfect, though: its speaking speed is a little slow, and it pronounced "qi" as "kì". There are even more human-like examples; listen carefully: you heard it right, when GPT-4o read the works of American poet Emily Dickinson, it cried! (As if the deeper the emotion, the stronger the delivery.) The effect unsettled netizens, who described it as "creepy".

Crazy Testing by Netizens

However, this is just a small part of the crazy testing by netizens. There are many more interesting examples. Let’s continue reading.

If you were asked to count from 1 to 10 in English as fast as you could, how many seconds would it take? Some netizens made exactly this request to GPT-4o to experience the speed of AI. When they asked it to count from 1 to 10 faster, the "AI subtitle" recognition feature failed to keep up. And when GPT-4o was asked to speed-count from 1 to 50, it could be heard taking deep breaths like a human.

Next, this netizen raised the bar: speed-counting from 1 to 100. Although GPT-4o did not fully meet the requirement at first, under his continued coaching it finally completed the task. In addition, when it comes to being funny, GPT-4o is also good at learning to meow: (let’s learn to meow together, meow meow meow~)

Of course, with real-time, multilingual voice, netizens were certainly not going to miss the chance to test it. The highlight is interrupting and switching languages at will: Urdu → Hebrew → Norwegian → Moroccan Darija → Amharic → Hungarian → Georgian → Klingon.

There are more practical functions. For example, if you are playing a Japanese game but can’t understand Japanese, just let GPT-4o help you: Wow, GPT-4o has transformed itself into a real-time translator.

Feedback from Professor Ethan Mollick

In addition to the specific cases mentioned above, Wharton School professor Ethan Mollick also talked about his feelings. He summarized GPT-4o’s advanced speech capabilities into three points:

  1. It works just as well as OpenAI demonstrated at launch.
  2. It is apparently capable of generating more than plain speech, but there are limitations.
  3. It is uncanny: lots of subconscious cues make it feel like you are talking to a person.

Comparison with ChatGPT's Previous Capabilities

ChatGPT’s previous speech processing relied on a chain of three models: first, a model that transcribes the user's speech into text; second, GPT-4, which parses and responds to the text; and finally, a model that converts ChatGPT’s reply back into speech. In contrast, GPT-4o is natively multimodal and completes all of these tasks on its own, without the assistance of other models, which greatly reduces the waiting time during a conversation.
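The cascaded design described above can be sketched as a chain of stub functions. This is a minimal illustration, not OpenAI's actual code: the function names and latency figures are hypothetical stand-ins, and `time.sleep` merely simulates each stage's processing delay.

```python
import time

# Hypothetical stand-ins for the three models in ChatGPT's old voice pipeline.
# The sleep durations are illustrative only, not real model latencies.

def speech_to_text(audio: bytes) -> str:
    time.sleep(0.05)  # stage 1: transcribe the user's speech into text
    return "hello"

def generate_reply(prompt: str) -> str:
    time.sleep(0.05)  # stage 2: GPT-4 parses the text and generates a reply
    return f"reply to {prompt!r}"

def text_to_speech(text: str) -> bytes:
    time.sleep(0.05)  # stage 3: synthesize the reply text as audio
    return text.encode()

def cascaded_voice_turn(audio: bytes) -> bytes:
    # Latencies add up: each stage must finish before the next starts, and
    # non-text cues (tone, emotion, breathing) are lost at the text boundary.
    return text_to_speech(generate_reply(speech_to_text(audio)))

def end_to_end_voice_turn(audio: bytes) -> bytes:
    # One multimodal model maps audio in -> audio out in a single step.
    time.sleep(0.05)
    return b"audio reply"
```

Because the cascaded turn pays three stage latencies in sequence while the end-to-end turn pays only one, the single multimodal model responds noticeably faster per conversational turn.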

In addition, OpenAI also emphasized that GPT-4o can recognize and respond to emotional changes in user voices, such as being able to perceive emotions such as sadness and excitement.

Public Response and Future Expectations

As more and more netizens posted their test results, onlookers could no longer hold back and urged OpenAI to open access to more people as soon as possible. So, what do you think are the most interesting ways to use GPT-4o's advanced voice features? Feel free to leave a comment below to discuss~
