The AI Dilemma: Google Gemini's 800-Hour Pokemon Journey

Published On Thu Jun 19 2025
A recent report describes how Google's Gemini AI struggled with reasoning and decision-making while playing a classic Pokemon game. AI is typically used for problem-solving and answering questions, but the report suggests that, like humans, it can also experience moments of "panic" that lead to irrational behavior.

Gemini Plays Pokemon Experiment

Earlier this year, an independent developer named Joel Zhang started a Twitch stream titled "Gemini Plays Pokemon," in which the Gemini AI model played Pokemon Blue. The experiment aimed to gauge how well the AI could navigate the game and whether it could complete it.

The Gemini team at Google DeepMind documented their findings in a report on June 18, highlighting a case study from the Twitch channel. They observed a peculiar behavior in the AI, which they termed "Agent Panic."

Agent Panic Behavior

During the playthrough, the Gemini 2.5 Pro AI frequently encountered situations that triggered this panic. For instance, when the health of the Pokemon in its party dropped, the AI behaved erratically, oscillating between immediately healing the party and repeatedly attempting to flee the current dungeon. To flee, it often relied on the move DIG or the ESCAPE ROPE item.

According to the report, the AI's reasoning degraded during these panic episodes, to the point where it forgot to use essential tools like the pathfinder. The behavior was pronounced enough that Twitch viewers noticed it.

Lengthy Gameplay

Ultimately, it took the AI a staggering 813 hours to complete the classic Pokemon game. The extended duration of the playthrough drew attention, showcasing the challenges AI systems can face in real-time gaming scenarios.

Similar "panic" behavior was observed in other AI models, such as Claude. In the game, the player character is returned to a Pokemon Center when all of its Pokemon faint, and viewers were shocked to see the AI deliberately let its Pokemon faint in a mistaken attempt to teleport to a different location.

The experiment highlights the complexities of AI decision-making in dynamic gaming environments and sheds light on both the capabilities and the limitations of current AI systems.