Thursday, June 11, 2026

Google releases Gemini 3.1 Pro: Benchmarks, how to try it


Google launched its newest core reasoning model, Gemini 3.1 Pro, on Thursday. Google says that Gemini 3.1 Pro achieved twice the verified performance of 3 Pro on ARC-AGI-2, a popular benchmark that measures a model’s logical reasoning.

Google initially launched Gemini 3 and 3 Pro in November, and this new release shows just how fast AI companies are introducing new and updated models. Gemini 3.1 Pro is the new core model powering Gemini and various Google AI tools, such as Gemini 3 Deep Think. Google says it's designed to provide more creative solutions.

“3.1 Pro is designed for tasks where a simple answer isn't enough, taking advanced reasoning and making it useful for your hardest challenges,” a Google blog post states. “This improved intelligence can help in practical applications, whether you’re looking for a clear, visual explanation of a complex topic, a way to synthesize data into a single view, or bringing a creative project to life.”

Here's everything we know so far about Gemini 3.1 Pro, including how it compares to the latest models from Anthropic and OpenAI, and how to try it yourself.

How to try Gemini 3.1 Pro

Starting today, Google is rolling out Gemini 3.1 Pro in the Gemini app, the Gemini API, and NotebookLM. Free users will be able to try 3.1 Pro in the Gemini app, but paid users on Google AI Pro and AI Ultra plans will have higher usage limits. Within NotebookLM, only those paid users will have access to 3.1 Pro, at least for now. Developers and enterprises can also access the new core model through AI Studio, Antigravity, Vertex AI, Gemini Enterprise, Gemini CLI, and Android Studio.

Gemini 3.1 Pro was already available for Mashable editors using Gemini. To try it for yourself, head to Gemini on desktop or open the Gemini mobile app.
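For developers going through the Gemini API rather than the app, a request might look roughly like the sketch below. It only builds the HTTP request for the public `generateContent` REST endpoint using Python's standard library. The model ID `gemini-3.1-pro-preview` is a placeholder assumption, since Google's actual identifier is not given here, and the helper names `build_request`, `first_text`, and `ask` are made up for illustration.

```python
import json
import urllib.request

# Public REST base URL for the Gemini API (v1beta).
API_BASE = "https://generativelanguage.googleapis.com/v1beta"

# ASSUMPTION: placeholder model ID. The article does not state the real one,
# so check Google's model list before using this value.
MODEL_ID = "gemini-3.1-pro-preview"


def build_request(prompt: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for one text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_BASE}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def first_text(response_json: dict) -> str:
    """Extract the text of the first candidate from a generateContent response."""
    return response_json["candidates"][0]["content"]["parts"][0]["text"]


def ask(prompt: str, api_key: str) -> str:
    """Send the prompt and return the first text reply (requires network access)."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return first_text(json.load(resp))
```

Calling `ask("Explain ARC-AGI-2 in two sentences.", api_key)` with a key from AI Studio would send the request. Nothing is sent until `ask` is called, so the request can be inspected first.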

Image: Two results of the same animation prompt. Credit: Google


Why Gemini 3.1 Pro matters

When Google released Gemini 3 Pro in November, the model was so impressive that it allegedly prompted OpenAI CEO Sam Altman to declare a code red. As Gemini 3 Pro surged to the top of AI leaderboards, OpenAI reportedly started losing ChatGPT users to Gemini. The latest core ChatGPT model, GPT-5.2, has tumbled down the rankings on leaderboards like Arena (formerly known as LMArena), losing significant ground to competitors such as Google, Anthropic, and xAI.

Gemini 3 Pro was already outperforming GPT-5.2 on many benchmarks, and with a more advanced thinking model, Gemini could move even further ahead.

Gemini 3.1 Pro: Benchmark performance

Google released benchmark performance data showing that Gemini 3.1 Pro outperforms previous Gemini models, Claude Sonnet 4.6, Claude Opus 4.6, and GPT-5.2. However, OpenAI’s new coding model, GPT-5.3-Codex, beat Gemini 3.1 Pro on the verified SWE-Bench Pro benchmark, according to Google itself.

Notable highlights from Gemini 3.1 Pro’s benchmark results include:

  • 44.4 percent on Humanity’s Last Exam, compared to 40.0 percent for Claude Opus 4.6 and 34.5 percent for GPT-5.2

  • 77.1 percent on ARC-AGI-2, compared to 31.1 percent for Gemini 3 Pro, 68.8 percent for Claude Opus 4.6, and 52.9 percent for GPT-5.2

  • 94.3 percent on GPQA Diamond, compared to 91.9 percent for Gemini 3 Pro, 91.3 percent for Claude Opus 4.6, and 92.4 percent for GPT-5.2

  • 80.6 percent on SWE-Bench Verified, compared to 76.2 percent for Gemini 3 Pro, 80.8 percent for Claude Opus 4.6, and 80.0 percent for GPT-5.2

  • 54.2 percent on SWE-Bench Pro (Public), compared to 43.3 percent for Gemini 3 Pro, 55.6 percent for GPT-5.2, and 56.8 percent for GPT-5.3-Codex

  • 92.6 percent on MMLU, compared to 91.1 percent for Claude Opus 4.6 and 89.6 percent for GPT-5.2
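To make the comparisons above easier to check, here is a small Python sketch that tabulates only the figures quoted in this list, then works out which model leads each benchmark and how large the ARC-AGI-2 jump is. The `SCORES` table and the `leader` and `ratio` helpers are illustrative names, not anything Google published.

```python
# Scores in percent, copied from the benchmark figures quoted above.
SCORES = {
    "Humanity's Last Exam": {
        "Gemini 3.1 Pro": 44.4, "Claude Opus 4.6": 40.0, "GPT-5.2": 34.5,
    },
    "ARC-AGI-2": {
        "Gemini 3.1 Pro": 77.1, "Gemini 3 Pro": 31.1,
        "Claude Opus 4.6": 68.8, "GPT-5.2": 52.9,
    },
    "GPQA Diamond": {
        "Gemini 3.1 Pro": 94.3, "Gemini 3 Pro": 91.9,
        "Claude Opus 4.6": 91.3, "GPT-5.2": 92.4,
    },
    "SWE-Bench Verified": {
        "Gemini 3.1 Pro": 80.6, "Gemini 3 Pro": 76.2,
        "Claude Opus 4.6": 80.8, "GPT-5.2": 80.0,
    },
    "SWE-Bench Pro (Public)": {
        "Gemini 3.1 Pro": 54.2, "Gemini 3 Pro": 43.3,
        "GPT-5.2": 55.6, "GPT-5.3-Codex": 56.8,
    },
    "MMLU": {
        "Gemini 3.1 Pro": 92.6, "Claude Opus 4.6": 91.1, "GPT-5.2": 89.6,
    },
}


def leader(benchmark: str) -> str:
    """Return the model with the highest quoted score on a benchmark."""
    results = SCORES[benchmark]
    return max(results, key=results.get)


def ratio(benchmark: str, model: str, baseline: str) -> float:
    """Return how many times higher `model` scored than `baseline`."""
    results = SCORES[benchmark]
    return results[model] / results[baseline]


for name in SCORES:
    print(f"{name}: {leader(name)} leads")

arc = ratio("ARC-AGI-2", "Gemini 3.1 Pro", "Gemini 3 Pro")
print(f"ARC-AGI-2, 3.1 Pro vs. 3 Pro: {arc:.2f}x")
```

On these numbers, Gemini 3.1 Pro leads four of the six benchmarks, Claude Opus 4.6 edges it on SWE-Bench Verified, and GPT-5.3-Codex leads SWE-Bench Pro (Public). The ARC-AGI-2 score works out to about 2.48 times that of Gemini 3 Pro.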

Google released an image showing the full benchmark results for Gemini 3.1 Pro:


Disclosure: Ziff Davis, Mashable’s parent company, in April 2025 filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.


