Google Unveils Gemini 3 Flash, Boosting AI Speed and Capability

Key Points

  • Google launches Gemini 3 Flash, a faster, more capable AI model.
  • Available via Gemini app, Search, API, Vertex AI, AI Studio, and Antigravity.
  • Benchmark scores improve across academic, reasoning, and coding tests.
  • Scores 33.7% on Humanity’s Last Exam without tool use, roughly three times the previous Flash model’s score.
  • Gains nearly 20 points on the SWE‑Bench Verified coding test over the 2.5 branch.
  • SimpleQA Verified score rises to 68.7% from the prior Flash model’s 28.1%, closing the gap with Gemini 3 Pro.
  • Runs workloads three times faster than Gemini 2.5 Pro.
  • Token pricing is $0.50 per million input tokens and $3 per million output tokens, below the Pro model’s $2 and $12.
  • Retains multimodal and interactive simulation capabilities of the Pro model.
  • Designed to serve both developers and everyday users with lower cost and higher speed.

Google releases Gemini 3 Flash, promising improved intelligence and efficiency
[Figure: Gemini HLE test]

Launch and Availability

Google introduced Gemini 3 Flash as the latest addition to its Gemini family of generative AI models. The rollout follows the earlier launch of Gemini 3 Pro and expands the platform’s reach across multiple Google services. Users can access Gemini 3 Flash immediately via the Gemini app, Google Search, the Gemini API, Vertex AI, AI Studio, and the Antigravity environment, ensuring broad availability for both consumer and developer audiences.

Performance Improvements

According to Google’s benchmark results, Gemini 3 Flash outperforms the older 2.5 Flash model across a range of tests. On academic and reasoning benchmarks such as GPQA Diamond and MMMU Pro, the new model scores higher than its predecessor and even surpasses Gemini 3 Pro on certain metrics. The most striking improvement appears in Humanity’s Last Exam (HLE), where Gemini 3 Flash scored 33.7 percent without tool use. That is roughly three times the score of its predecessor and only a few points behind Gemini 3 Pro.

Coding Capabilities

Google highlights a significant leap in coding proficiency for Gemini 3 Flash. In the SWE‑Bench Verified test, the model gained almost 20 points compared with the 2.5 branch, indicating a stronger ability to generate and understand code. Gemini’s Pro models have historically been the primary choice for programming tasks, and this result narrows that gap.

General Knowledge Accuracy

On the SimpleQA Verified test, Gemini 3 Flash recorded a score of 68.7 percent, closely matching Gemini 3 Pro and far exceeding the 28.1 percent achieved by the prior Flash model. The 40.6‑point gain indicates markedly fewer errors on general‑knowledge questions, which broadens the range of tasks the model can handle reliably.

Efficiency and Cost

Beyond raw performance, Gemini 3 Flash offers notable efficiency gains. The model runs workloads three times faster than Gemini 2.5 Pro and is priced well below the Pro tier. Google lists the cost for one million input tokens at $0.50 and one million output tokens at $3.00. By comparison, Gemini 3 Pro costs $2.00 per million input tokens and $12.00 per million output tokens, making Flash the more economical choice for developers who pay per token.
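
To make the price difference concrete, here is a minimal Python sketch that applies the per‑million‑token rates quoted above to a workload. The workload size (10 million input tokens, 2 million output tokens) and the names Pricing, FLASH, and PRO are hypothetical and chosen only for illustration. The sketch uses list prices alone and ignores any discounts, caching, or tiered rates that may apply in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token list prices in USD, as quoted in the article."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        # Scale each token count by its per-million rate, then sum.
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Rates from the article; the constant names are illustrative.
FLASH = Pricing(input_per_million=0.50, output_per_million=3.00)
PRO = Pricing(input_per_million=2.00, output_per_million=12.00)

# Hypothetical workload, not a figure from the article.
input_tokens = 10_000_000
output_tokens = 2_000_000

flash_cost = FLASH.cost(input_tokens, output_tokens)
pro_cost = PRO.cost(input_tokens, output_tokens)

print(f"Flash: ${flash_cost:.2f}")
print(f"Pro:   ${pro_cost:.2f}")
print(f"Pro costs {pro_cost / flash_cost:.1f}x as much as Flash")
```

Because both quoted Flash rates are exactly one quarter of the corresponding Pro rates, the fourfold difference holds for any mix of input and output tokens, not just this workload.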

Multimodal and Interactive Features

Gemini 3 Flash retains the multimodal capabilities introduced with Gemini 3 Pro, enabling the generation of interactive simulations and mixed‑media content. While the underlying technology is consistent with the Pro model, Flash delivers these features with faster processing times and reduced expense.

Implications for Developers and Users

The launch positions Gemini 3 Flash as a versatile, cost‑effective solution for a wide array of applications—from search enhancements and conversational agents to code generation and multimedia creation. Its availability across Google’s AI ecosystem simplifies integration for developers, while end users benefit from quicker, more accurate responses in everyday interactions.

Source: arstechnica.com