, an experimental model designed to enhance AI reasoning capabilities. This mode can "explicitly demonstrate its thought process," providing users with stronger reasoning abilities compared to the basic Gemini 2.0 Flash model.
o1 and o1 Pro answer the same question but take longer to think. (Flash Thinking took 14.3 seconds, o1 took 1 minute and 42 seconds, o1 Pro took 2 minutes and 18 seconds.)
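To put those thinking times side by side, here is a small sketch that converts them to seconds and expresses each as a multiple of the Flash Thinking time. The variable names are only illustrative, and the figures are the ones quoted above from a single run, so treat the ratios as a rough indication rather than a benchmark.

```python
# Thinking times quoted above, converted to seconds.
thinking_seconds = {
    "Gemini 2.0 Flash Thinking": 14.3,
    "o1": 1 * 60 + 42,      # 1 minute 42 seconds
    "o1 Pro": 2 * 60 + 18,  # 2 minutes 18 seconds
}

# Use Flash Thinking as the baseline and compute how many times longer
# each model took, rounded to one decimal place.
baseline = thinking_seconds["Gemini 2.0 Flash Thinking"]
slowdown = {
    name: round(seconds / baseline, 1)
    for name, seconds in thinking_seconds.items()
}

for name, ratio in slowdown.items():
    print(f"{name}: {thinking_seconds[name]:.1f} s ({ratio}x the baseline)")
```

On this one question, o1 took roughly 7 times as long as Flash Thinking and o1 Pro close to 10 times as long.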
Characteristics of the Gemini 2.0 Flash Thinking Mode
Functionality of the Thoughts Panel
- The Thoughts panel is collapsed by default and can be expanded by clicking on the Thoughts title bar.
- Unlike the main response returned, the content in the Thoughts panel cannot be edited in Google AI Studio.
Users can access and experience this feature within Google AI Studio, gaining an intuitive understanding of the logical reasoning process used by the model when generating responses.
Functional Limitations
Currently, as an experimental model, Flash Thinking Mode still has some limitations:
- Supports 32k token input and accepts only text and image formats.
- Supports up to 8k token output and only supports text output.
- This mode does not support the use of built-in tools such as search functionality or code execution.
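The limits above can be turned into a quick pre-flight check before sending a prompt. The following is a minimal sketch, not part of any Google SDK: `PlannedRequest` and `check_request` are hypothetical names, and "32k" and "8k" are assumed here to mean 32,000 and 8,000 tokens.

```python
from dataclasses import dataclass

# Limits as stated in the list above. Reading "32k" and "8k" as 32,000 and
# 8,000 tokens is an assumption about the exact figures.
MAX_INPUT_TOKENS = 32_000
MAX_OUTPUT_TOKENS = 8_000
ALLOWED_INPUT_TYPES = {"text", "image"}


@dataclass
class PlannedRequest:
    """Hypothetical description of a request; not a Google API type."""
    input_tokens: int
    input_types: set
    max_output_tokens: int
    uses_tools: bool = False


def check_request(req):
    """Return a list of limit violations (empty if the request fits)."""
    problems = []
    if req.input_tokens > MAX_INPUT_TOKENS:
        problems.append("input exceeds 32k tokens")
    unsupported = req.input_types - ALLOWED_INPUT_TYPES
    if unsupported:
        problems.append(f"unsupported input types: {sorted(unsupported)}")
    if req.max_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append("requested output exceeds 8k tokens")
    if req.uses_tools:
        problems.append("built-in tools (search, code execution) are not supported")
    return problems


ok = PlannedRequest(input_tokens=12_000, input_types={"text", "image"},
                    max_output_tokens=4_000)
bad = PlannedRequest(input_tokens=40_000, input_types={"text", "audio"},
                     max_output_tokens=4_000, uses_tools=True)
print(check_request(ok))
print(check_request(bad))
```

The first request passes with no violations, while the second is flagged for its input size, the audio input, and the use of tools.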
Chatbot Arena LLM Leaderboard
Surged to first place.
The leap from Gemini-2.0-Flash:
- Overall: #3 → #1
- Overall (Style Control): #4 → #1
- Math: #2 → #1
- Creative Writing: #2 → #1
- Hard Prompts: #1 → #1 (+14 pts)
- Vision: #1 → #1 (+16 pts)

ChatGPT o1 / ChatGPT o1 Pro / Gemini-2.0-Flash Comparative Analysis
Challenge 1: Solving the Zebra Problem
Model | Performance |
---|---|
ChatGPT o1 | ✅ |
ChatGPT o1 Pro | ❌ |
Gemini-2.0-Flash-Thinking | ❌ |
Solve this Zebra Puzzle:
I solved it myself first, using human intelligence. This is the standard answer:
o1
Thought process
Output answer
Correct answer!
o1 Pro
Thought process
Output answer
Evaluation: It knows its answer is problematic, but it thinks there is a typo in the question. This habit of blaming external causes first really does resemble us mediocre humans.
Flash Thinking
Thought process
Output answer
Evaluation: Hmm, still isn't quite right.
Challenge 2: Drawing ASCII Art
Model | Performance |
---|---|
ChatGPT o1 | ❌ |
ChatGPT o1 Pro | ✅ |
Gemini-2.0-Flash-Thinking | ❌ |
draw an ASCII Art Paint of a unicorn
o1
Thought process
Output answer

Evaluation: This is also a "soul artist"!
o1 Pro
Thought process
Output answer

Evaluation: I love this one so much. But judging from the thought process, the image was found online... Not sure if that counts as cheating.
Flash Thinking
Thought process
Output answer

Evaluation: Hmm... hard to describe.
Challenge 3: Reading comprehension
Model | Performance |
---|---|
ChatGPT o1 | ✅ |
ChatGPT o1 Pro | ❌ |
Gemini-2.0-Flash-Thinking | ✅⭐️ |
Read the article – https://situational-awareness.ai/from-gpt-4-to-agi/
For the drivers of progress in the coming four years following GPT-4(2023-2027), Compute, Algorithmic Efficiency, and Unhobbling, How many OOMs can each driver contribute? Could you sum up the answer and make it a markdown table?
Here is the standard answer first:
o1
Output answer
Close enough.
o1 Pro
Output answer
Its reading comprehension is missing something.
Flash Thinking
Output answer
Closest to the standard answer!!!