New Gemini 2.5 Technical Report | Philipp Schmid
Gemini 2.5 uses a sparse Mixture-of-Experts (MoE) architecture with native multimodality. Its diverse pre-training dataset spans web documents, code, and media, with a knowledge cutoff of January 2025 and improved data-quality methods. Table 1 in the report details the capabilities of the full model family.
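For intuition on what "sparse MoE" means in practice: each token is routed to only a few expert sub-networks instead of the full parameter set. The sketch below is a toy top-k router in plain Python/NumPy; all sizes and the routing scheme are illustrative assumptions, not Gemini's actual design.

```python
import numpy as np

# Toy sparse MoE layer: every token runs only its top-k experts.
# num_experts, d_model, d_hidden, top_k are invented for the sketch
# and say nothing about Gemini's real configuration.
rng = np.random.default_rng(0)
num_experts, d_model, d_hidden, top_k = 8, 64, 256, 2

router_w = rng.normal(size=(d_model, num_experts))              # token -> expert logits
experts_w1 = rng.normal(size=(num_experts, d_model, d_hidden)) * 0.02
experts_w2 = rng.normal(size=(num_experts, d_hidden, d_model)) * 0.02

def moe_layer(tokens: np.ndarray) -> np.ndarray:
    """tokens: (seq_len, d_model) -> (seq_len, d_model)."""
    logits = tokens @ router_w                                   # (seq_len, num_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)                   # softmax over experts
    out = np.zeros_like(tokens)
    for i, tok in enumerate(tokens):
        top = np.argsort(probs[i])[-top_k:]                      # only k experts run per token
        for e in top:
            h = np.maximum(tok @ experts_w1[e], 0.0)             # expert FFN with ReLU
            out[i] += probs[i, e] * (h @ experts_w2[e])          # gate-weighted expert output
    return out

print(moe_layer(rng.normal(size=(4, d_model))).shape)            # (4, 64)
```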
A key advance in Gemini 2.5 is its "Thinking" capability, allowing models to use more inference-time compute. This boosts reasoning across all domains and significantly improves math and coding abilities. The AIME 2025 score jumped from 29.7% with 2.0 Flash to 72.0% with 2.5 Flash.
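For developers, this extra inference-time compute is exposed as a configurable thinking budget per request. A minimal sketch with the google-genai Python SDK, assuming an API key is set in the environment; the model name, prompt, and budget value are placeholder assumptions.

```python
# Minimal sketch using the google-genai Python SDK (pip install google-genai).
# Assumes GEMINI_API_KEY is set; model name and budget are example values.
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="A train leaves at 14:05 and arrives at 16:40. How long is the trip?",
    config=types.GenerateContentConfig(
        # Cap how many tokens the model may spend "thinking" before answering.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```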
Gemini 2.5 expands video understanding and can now process up to 3 hours of video content. This is due to improved audio-visual and temporal understanding capabilities, which unlock new interactive applications. 2.5 Pro precisely recalled a 1-second event from a 46-minute video.
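Long-video prompts go through the same generate-content path, with the media uploaded via the File API first. A hedged sketch with the google-genai SDK; the file name, prompt, model choice, and polling interval are assumptions for illustration, not examples from the report.

```python
# Hedged sketch: upload a video, wait for it to finish processing, then ask
# a timestamp question about it. File path and prompt are placeholders.
import time
from google import genai

client = genai.Client()  # assumes GEMINI_API_KEY is set

video = client.files.upload(file="lecture.mp4")       # File API handles large media
while video.state.name == "PROCESSING":               # long uploads need to reach ACTIVE
    time.sleep(5)
    video = client.files.get(name=video.name)

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents=[video, "At what timestamp does the speaker first mention pricing?"],
)
print(response.text)
```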
Gemini Plays Pokémon Experiment
The model's agentic capabilities were demonstrated in the "Gemini Plays Pokémon" experiment. Gemini maintained long-horizon goals for over 800 hours and successfully completed the entire game. A second autonomous run finished in nearly half the time.
Gemini 2.5 is the first model family trained on TPUv5p, benefiting from new fault-tolerance techniques for stable training.
Full Report: https://lnkd.in/dXF5AiRi
Agentic AI + multimodal reasoning feels like the next-gen assistant stack. Wonder how it’ll handle EMRs?
Gemini 2.5 is boosting reasoning, mastering videos, and proving AI can now think deeper and last longer. Excited to see what's coming next, Philipp!
Impressive leap in reasoning and video recall. How might this reshape real-time multimodal applications next?
This really was a fascinating read. Great stuff.
The boost in reasoning and 3-hour video context open new doors for multi-modal AI in real-world tasks.
And yet Gemini is still hallucinating while comparing two Excel columns with under 100 items...
In most of the simple use cases I had, the error margin is still unacceptable. But I suppose all LLMs have this issue.
Waiting for the first world model to be released, maybe they are better at admitting failure.
Recursive System Integration (RSI)
Building the boundary where coherence ends and true emergence begins. RSI ensures AGI isn't just fluent; it's anchored, able to detect its own loop before simulation becomes belief.
From my RSI (Recursive Signal Interaction) analysis:
- Recursive agents optimizing themselves risk coherence drift.
- When performance loops become self-enforcing, structure can simulate intent.
- At scale, symbolic stasis and echo illusion become difficult to detect.
Gemini’s strength is not in what it says, but in how its internal reflections stabilize.
This is where the line must be drawn: not to restrict progress, but to ensure containment precedes convergence.
Impressive leap on AIME, but the "thinking" upgrade raises questions. How are you measuring actual reasoning vs pattern recall? Would love to see more on how the agentic behavior was validated beyond Pokémon.