10 Brilliant Ways to Make Google's Gemini Think Smarter

Published On Sat Apr 19 2025

Google has released an upgraded version of its latest AI model, with a new feature that lets you "turn thinking on or off." On Thursday, the tech giant rolled out an early version of Gemini 2.5 Flash, an update to the 2.5 model it released in March.

Google is also ready to let you choose how much this new model thinks. And if you really want to, you can tell it to stop thinking completely. In a blog post, Google Gemini's director of product management, Tulsee Doshi, said that developers can "set thinking budgets to find the right tradeoff between quality, cost, and latency."

The new feature aims to address the intense processing and computing requirements of a new wave of "reasoning" models that have spurred interest across the AI industry.

Efficient Computing Power

Google wants its reasoning model to use only as much processing power as a task requires, and only when it is needed. Doshi noted that not all tasks require the same amount of reasoning.

So that different queries can get different amounts of reasoning, Google will let developers set a "thinking budget," which Doshi said will offer "fine-grained control" over the number of tokens (units of data) a model generates while thinking.
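For developers, a thinking budget is a number sent along with the prompt. The sketch below shows roughly what that might look like as a request body. It is an illustration and not Google's reference code: the field names (generationConfig, thinkingConfig, thinkingBudget) and the 0 to 24,576 token range for 2.5 Flash are assumptions drawn from Google's public Gemini API material, not from this article, and build_request is a hypothetical helper. No network call is made.

```python
import json

# Assumed budget range for Gemini 2.5 Flash (not stated in this article).
MIN_BUDGET = 0
MAX_BUDGET = 24576


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a JSON-style request body with a capped thinking budget.

    Hypothetical helper: it only assembles a dictionary and sends nothing.
    """
    # Clamp the requested budget into the assumed valid range.
    budget = max(MIN_BUDGET, min(MAX_BUDGET, thinking_budget))
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": budget},
        },
    }


if __name__ == "__main__":
    # A budget of 0 asks the model not to think at all.
    print(json.dumps(build_request("What is 2 + 2?", 0), indent=2))
    # A larger budget leaves room for more reasoning on a harder task.
    print(json.dumps(build_request("Plan a database migration.", 8192), indent=2))
```

A simple question gets a budget of zero, while a harder task gets more room to reason. That is the quality, cost and latency tradeoff Doshi described.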

The introduction of a "thinking budget" also follows a wider industry shift toward more "efficient" use of computing power. That shift came after Chinese startup DeepSeek released a reasoning model in January that it claimed used less computing power.

Check out the preview of Gemini 2.5 Flash on https://t.co/lLpF8ToTVJ for another Gemini data point on the cost-performance pareto frontier!