Gemini 2.5 Pro launches to great fanfare, billed as far ahead of the competition: it surpasses other LLMs across the board and tops the global rankings

In the early hours of today, Google released Gemini 2.5 Pro, which immediately became the top-ranked AI model worldwide. It leads the LMArena leaderboard by a wide margin, about 40 points ahead of Grok 3, which was released last month. Noam Shazeer's deep involvement suggests that Gemini 2.5 Pro may incorporate the core technology behind Flash Thinking. Releasing 2.5 Pro before 2.5 Flash is also an interesting choice of sequencing.
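To put a 40-point lead in perspective, the gap can be converted into an expected head-to-head win rate. The sketch below assumes the leaderboard points behave like classic Elo ratings with the conventional 400-point scale; LMArena's actual fitting procedure may differ, and the function name is purely illustrative.

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win probability for the higher-rated model.

    Uses the standard Elo logistic formula. The 400-point scale is the
    conventional Elo default and is an assumption here, not a detail
    taken from the leaderboard itself.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    # A 40-point gap under the Elo assumption above.
    print(f"{expected_win_rate(40):.3f}")  # prints 0.557
```

Under that assumption, a 40-point lead means the higher-rated model would be preferred in roughly 56 out of 100 pairwise comparisons: a clear edge, but far from winning every matchup.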

Simon Willison, Paul Gauthier (aider), Andrew Carr, and several other industry experts quickly shared their views on Gemini 2.5 Pro, all agreeing: "This model undoubtedly meets the SOTA (state-of-the-art) standard."

Model Characteristics

1. Gemini 2.5 Pro Tops the Charts: Performance soars, comprehensively surpassing competitors

The newly released Gemini 2.5 Pro Experimental (code name Nebula) quickly topped the LMArena leaderboard with a record-high score, overtaking Grok 3 and GPT-4.5, which previously held the top spots. The model leads in multiple categories, including mathematics, creative writing, instruction following, long-query handling, and multi-turn conversation, a significant leap in performance.

2. Google's speed is jaw-dropping, leaving netizens exclaiming "unbelievable"

Many users are astonished by how quickly Google launched Gemini 2.5. Some cited a request that Google co-founder Sergey Brin reportedly made, according to *The New York Times*: "Google should stop making 'nanny-like' products." Another user was more direct: "The progress is so fast, it's hard to believe!" The community was generally amazed at the pace of Google's AI development.

3. Gemini 2.5 Pro has outstanding coding capabilities, setting a new record on the Aider Polyglot benchmark.

Gemini 2.5 Pro Experimental achieved excellent results on the Aider Polyglot benchmark, with an overall score of 74% and a diff score of 68.6%, setting a new state-of-the-art (SOTA) result and significantly surpassing previous Gemini models. User feedback indicates that the model is particularly good at generating architecture diagrams from code repositories, further supporting its top-tier performance on programming tasks. However, some users have pointed out that the model is inconsistent on certain coding tasks and that the current rate limits are fairly strict.

Google has not yet announced pricing for Gemini 2.5 Pro, but users can already try a rate-limited "experimental version" for free.

Official explanation

What is "thinking AI"?

The Gemini 2.5 models are "thinking models": they reason through and analyze a problem before responding, which significantly improves overall performance and accuracy.

In the field of artificial intelligence, "reasoning" does not merely mean classification and prediction; it also reflects AI's ability to analyze information, draw logical conclusions, incorporate context and nuances, and make rational decisions.

For a long time, Google has been committed to improving the intelligence and reasoning capabilities of AI, such as through techniques like reinforcement learning and chain-of-thought prompting. Building on this foundation, we released our first thinking model: Gemini 2.0 Flash Thinking.

Now, Gemini 2.5 combines a more powerful base model with improved post-training techniques, raising AI reasoning capabilities to a new level. Going forward, Google will build this "thinking ability" directly into all Gemini models, enabling them to solve more complex problems and support more advanced, context-aware AI agents.

Deeply Enhanced Reasoning Capabilities

Gemini 2.5 Pro leads on multiple benchmarks for advanced reasoning tasks. Without relying on test-time techniques that increase cost, such as "majority voting," Gemini 2.5 Pro has taken the industry lead on the GPQA science benchmark and the AIME 2025 math benchmark.
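For readers unfamiliar with "majority voting," the idea is to sample several answers to the same question and keep the most common one, which is why it multiplies inference cost. The following is a minimal sketch of that idea, not Google's evaluation code; `sample_fn` is a hypothetical stand-in for any function that queries a model once and returns its final answer as a string.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer after light normalization.

    Ties go to the answer that appeared first, which is how
    Counter.most_common orders equal counts.
    """
    if not answers:
        raise ValueError("need at least one answer to vote on")
    normalized = [a.strip().lower() for a in answers]
    return Counter(normalized).most_common(1)[0][0]


def answer_with_voting(sample_fn: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Query the (hypothetical) sampler n times and vote on the results.

    Cost scales linearly with n, since every vote is a full model call.
    """
    return majority_vote([sample_fn(prompt) for _ in range(n)])
```

A single-sample score, as reported here, corresponds to `n = 1`, so there is no extra inference cost hidden in the result.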

Moreover, on Humanity's Last Exam, a dataset designed by hundreds of domain experts to probe the limits of human knowledge and reasoning, Gemini 2.5 Pro scored 18.8% without using any tools, the highest score among models evaluated without tool use.

Programming capabilities reach new heights

We continue to focus on improving AI's programming capabilities. Gemini 2.5 is a significant leap over version 2.0, with further improvements planned. Gemini 2.5 Pro excels at creating visually impressive web applications and agentic code applications, and is strong at code transformation and editing. On SWE-Bench Verified, a widely recognized benchmark for agentic coding evaluation, Gemini 2.5 Pro scored 63.8% with a custom agent setup.

For example, with just a brief instruction, Gemini 2.5 Pro can automatically generate executable code using its powerful reasoning ability, quickly creating a complete video game application.

Multimodality and Long Context Windows

Gemini 2.5 builds on the native multimodality and long context window that distinguish the Gemini models. Gemini 2.5 Pro ships with a 1 million token context window (to be expanded to 2 million in the future) and shows significant performance improvements over its predecessor. It can understand massive datasets and handle complex problems that draw on different sources of information, including text, audio, images, video, and even entire code repositories.
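As a rough illustration of what a 1 million token window means in practice, the sketch below estimates whether a set of text files would fit. It assumes about 4 characters per token, a common rule of thumb for English text rather than the model's real tokenizer, and both function names are hypothetical.

```python
import math
from typing import Iterable


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The 4-chars-per-token ratio is a heuristic
    assumption; actual counts depend on the tokenizer and the content."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(texts: Iterable[str], budget: int = 1_000_000) -> bool:
    """Check whether the combined estimate stays within the token budget."""
    return sum(estimate_tokens(t) for t in texts) <= budget
```

By this estimate, a 1 million token window corresponds to roughly 4 MB of plain English text, which is why whole code repositories become plausible inputs. For real usage, an exact count from the provider's own token-counting tool would be the safer check.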

Free to Try, More Features Coming Soon

According to news from Google engineer Casper Hansen, Gemini 2.5 Pro has been made freely available for all users:

  • Developers can try it right away in Google AI Studio;
  • Gemini Advanced users can select it in the Gemini app;
  • It will also be available on Google Cloud's Vertex AI platform soon.

Google will release pricing details in the coming weeks, allowing users to run Gemini 2.5 Pro in larger-scale production scenarios with higher rate limits.

In addition, Google researcher Steven Heidel shared new image generation capabilities added in Gemini 2.5 Pro: users can freely set the aspect ratio of images and generate several image variants at once, which makes image generation considerably more flexible.

Trial use