Google’s Gemini 3 is living up to the hype and creating games in one shot

Google’s Gemini 3 is finally here, and we’re impressed with the results, especially when it comes to building simple games.

Gemini 3 Pro is an impressive model, and early benchmarks confirm it.

For example, it tops the LMArena Leaderboard with a score of 1501 Elo. It also offers PhD-level reasoning, with top scores on Humanity’s Last Exam (37.5% without using any tools) and GPQA Diamond (91.9%).

Real-world results also back these numbers.

Pietro Schirano, who created MagicPath, a vibe coding tool for designers, says we’re entering a new era with Gemini 3.

In his tests, Gemini 3 Pro successfully created a 3D LEGO editor in one shot. This means a single prompt is enough to create simple games in Gemini 3, which is a big deal if you ask me.

LLMs have traditionally been bad at building games, but Gemini 3 shows some improvement in that direction.

This aligns with Google’s claims that Gemini 3 Pro redefines multimodal reasoning with 81% on MMMU-Pro and 87.6% on Video-MMMU benchmarks.

“It also scores a state-of-the-art 72.1% on SimpleQA Verified, showing great progress on factual accuracy,” Google noted in a blog post.

“This means Gemini 3 Pro is highly capable of solving complex problems across a vast array of topics like science and mathematics with a high degree of reliability.”

Gemini 3 is impressive in my early tests, but adherence remains an issue

I’ve been using Claude Code for a year now, and it’s been a great help with my Flutter/Dart projects.

Gemini 3 is a better model than Claude Sonnet 4.5, but there are some areas where Claude shines.

One of those areas is adherence. So far, no model has come close to Claude Code on that front, and Gemini 3 is no exception.

I personally found Claude Code better at following instructions. Claude Code is also the better CLI tool, which gives it an edge over competitors.

For everything else, Gemini 3 is a better choice, especially if you’ve been using Gemini 2.5 Pro.

If you use LLMs, I’d recommend sticking to Sonnet 4.5 for regular tasks and Gemini 3 Pro for complex queries.
