Google Announces Gemini 2.5 Flash Image (nano-banana): A New Era in AI Image Generation and Editing

Explore Google’s latest AI image model, Gemini 2.5 Flash Image (nano-banana). This article looks at its headline features, including multi-image fusion, character consistency, and natural language editing, and at how they give developers and businesses finer creative control.

Let’s be honest, the world of AI image generation is both fascinating and a bit of a headache. You’ve probably been there: you want the same character to appear in different scenes, but the AI keeps drawing a “stranger who looks kinda similar.” Or you just want to tweak a small detail in an image, only to have the entire picture ruined.

These little frictions in the creative process are the very pain points that creators are most eager to solve.

Just today, Google responded. It has officially launched Gemini 2.5 Flash Image (codename nano-banana), an image generation and editing model that currently tops the LMArena image-edit leaderboard. This is not just a small update but more of a complete evolution. It lets creators merge multiple images, keep characters consistent across different scenes, and make precise local edits with a single sentence.

When Gemini 2.0 Flash was first launched, everyone loved its low latency, high cost-effectiveness, and ease of use. But at the same time, the community also gave a lot of feedback: we need higher quality images and more powerful creative control.

Now, Gemini 2.5 Flash Image is here to deliver just that.

Developers can use the model today through the Gemini API and Google AI Studio, while enterprise users can access it through the Vertex AI platform. As for price, Gemini 2.5 Flash Image costs $0.30 per million input tokens and $30 per million output tokens. Each image consumes 1290 output tokens, so generating one image costs about $0.039 (1290 × $30 / 1,000,000 ≈ $0.0387).
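The per-image figure follows directly from the output-token price. As a quick sanity check, here is a minimal sketch in Python; the constant and function names are illustrative and not part of any Google SDK, and the estimate ignores the (much smaller) cost of input tokens.

```python
# Output price quoted in the article (USD per one million output tokens).
OUTPUT_PRICE_PER_MILLION = 30.00
# Output tokens billed per generated image, as quoted in the article.
TOKENS_PER_IMAGE = 1290


def image_output_cost(num_images: int = 1) -> float:
    """Estimated output-token cost in USD for generating num_images images."""
    return num_images * TOKENS_PER_IMAGE * OUTPUT_PRICE_PER_MILLION / 1_000_000


print(f"1 image:      ${image_output_cost(1):.4f}")    # $0.0387
print(f"1,000 images: ${image_output_cost(1000):.2f}")  # $38.70
```

At this rate, a batch of a thousand images comes to a little under $40 in output tokens.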

The data speaks for itself: The performance of Gemini 2.5 Flash Image

Talk is cheap, so let’s look at the data. According to benchmark tests from lmarena.ai and Google’s internal prompt set tests, Gemini 2.5 Flash Image has shown leading capabilities in several key metrics, especially in “overall preference” and “character” generation, where its performance even surpasses other well-known models on the market.

Here is a comparison of Elo ratings with other mainstream models (higher scores indicate better performance):

Category	Gemini 2.5 Flash Image	ChatGPT 4o / GPT Image 1	FLUX.1 Kontext [max]	Qwen Image Edit	Gemini 2.0 Flash Image
Character	~1230	~1100	~1020	~920	~860
Creative	~1120	~1050	~970	~990	~880
Object/Env	~1080	~1020	~1000	~1010	~900
Stylization	~1050	~1180	~950	~1100	~730

Rank (UB)	Model	Score	95% CI (±)	Votes	Organization	License
1	`gemini-2.5-flash-image-preview (nano-banana)`	1362	±2	2,521,035	Google	Proprietary
2	`flux-1-kontext-max`	1191	±3	357,196	Black Forest…	Proprietary
3	`flux-1-kontext-pro`	1174	±2	2,015,530	Black Forest…	Proprietary
3	`gpt-image-1`	1170	±3	1,026,399	OpenAI	Proprietary
5	`flux-1-kontext-dev`	1152	±3	1,584,400	Black Forest…	Proprietary
6	`qwen-image-edit`	1145	±2	1,585,904	Alibaba	Apache 2.0
6	`seededit-3.0`	1142	±4	1,285,080	Bytedance	Proprietary
8	`gemini-2.0-flash-preview-image-generation`	1093	±3	1,700,785	Google	Proprietary

Source: https://lmarena.ai/leaderboard/image-edit

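The leaderboard shows Gemini 2.5 Flash Image in first place overall, 171 points ahead of the runner-up, and the category comparison has it leading in Character, Creative, and Object/Env. Stylization is the exception, where GPT Image 1 and Qwen Image Edit score higher. Overall, the numbers point to significant progress in generation quality and creative control.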
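To get a feel for what a score gap means, the sketch below applies the classic Elo expected-score formula to two leaderboard entries. This is an assumption for illustration: LMArena's exact rating method may differ from textbook Elo, and the function name is hypothetical.

```python
def expected_win_rate(score_a: float, score_b: float) -> float:
    """Probability that A is preferred over B under the classic Elo formula."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


# Scores taken from the image-edit leaderboard table.
nano_banana = 1362
flux_kontext_max = 1191

p = expected_win_rate(nano_banana, flux_kontext_max)
print(f"Expected preference rate: {p:.0%}")
```

Under that assumption, a 171-point lead corresponds to the top model being preferred in roughly 73% of head-to-head comparisons against the runner-up.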

The superpowers of Gemini 2.5 Flash Image, demonstrated in practical applications

To give everyone a more intuitive feel for its power, the “build mode” of Google AI Studio has also been significantly updated. You can not only quickly test the model’s capabilities, but also create customized AI applications with a simple prompt, and even deploy them with one click or save the code to GitHub.

Next, let’s take a look at some of the most amazing features.

Character consistency? Not a problem anymore!

One of the biggest challenges in AI image generation is “maintaining the coherence of characters or objects.” Whether it’s creating a protagonist for a storybook, producing display images of a product from different angles for e-commerce, or generating a series of stylistically consistent materials for a brand, keeping the subject unchanged is key.

Gemini 2.5 Flash Image has made a major breakthrough in this area. You can now place the same character in completely different environments or situations while preserving their appearance. In the official demonstration, the same woman was portrayed as a chess master, a race car driver, a soccer player, and an archer, and her facial features remained highly consistent across all images.

Imagine what this enables: from a single design template, developers could generate stylistically uniform ID cards for every employee of a company, produce a large number of property cards for a real estate website, or create dynamic product mockups for an entire catalog.

Edit images by “talking”: Precise prompt-based editing

Besides getting characters right, precise local modification is also a major pain point. Gemini 2.5 Flash Image allows you to perform precise image editing using the most intuitive method—natural language.

What does this mean? You can use simple commands to do things like:

“Blur the background of this photo.”
“Remove the stain from the T-shirt.”
“Colorize this black and white photo.”
“Change the protagonist’s pose.”

Many everyday edits like these can be requested in a single sentence. In Google’s demonstration, a user uploaded a photo of a man wearing a black shirt and an earring, and gave the command: “change my shirt color to red and remove earring.” The model completed both modifications, generating a realistic photo of him wearing a red shirt and no earring.
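For developers, an edit like this boils down to sending one text instruction together with one image. The sketch below only builds the request body and does not send it. It assumes the Gemini API's `generateContent` REST endpoint and the model ID shown on the LMArena leaderboard; both should be checked against the current documentation, and `build_edit_request` is a hypothetical helper name.

```python
import base64
import json

# Assumption: model ID as listed on the LMArena leaderboard.
MODEL_ID = "gemini-2.5-flash-image-preview"
# Assumption: generateContent REST endpoint of the Gemini API.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL_ID}:generateContent"
)


def build_edit_request(prompt: str, image_bytes: bytes,
                       mime_type: str = "image/jpeg") -> dict:
    """Build a JSON-serializable body: one text instruction plus one image."""
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            # Image bytes are sent base64-encoded.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }


# Placeholder bytes stand in for a real photo read from disk.
body = build_edit_request(
    "change my shirt color to red and remove earring", b"<photo bytes>"
)
print(ENDPOINT)
print(json.dumps(body, indent=2))
```

Actually sending the body requires an API key and an HTTP client, which are left out here to keep the example self-contained.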

Multi-image fusion, seamlessly creating new scenes

Gemini 2.5 Flash Image also has the ability to understand and fuse multiple input images. This feature opens up a whole new door for creative work.

You can fuse an image of a product (e.g., a table lamp) with an image of an indoor scene, and the AI will automatically generate a highly realistic composite image, as if the lamp was originally in that room. You can also redesign the color scheme or material of a space, or fuse two completely different images into a brand new work of art.

To make it easier for everyone to experience, Google has also created a template application in AI Studio called “Home Canvas”. You just need to drag and drop product and scene images to quickly create photo-realistic composite images.

Not just drawing, it also understands hand-drawn sketches

The model’s capabilities go far beyond this. It can even understand hand-drawn diagrams and interact with them based on instructions.

In one demonstration case, a developer created an application called “Gemini Co-Drawing”. It turns a simple canvas into an interactive tutor. A user can draw a right-angled triangle with two sides labeled (30 and 40) and ask in text: “Solve for x and write the correct answer in red at the position of x.” Gemini 2.5 Flash Image can not only understand the diagram and the question, but also complete the complex editing steps as instructed, filling in the correct answer “50” in red font in the diagram.
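The answer in that demo is easy to verify. Assuming the two labeled sides are the legs of the triangle and x is the hypotenuse (which is consistent with the answer of 50), the Pythagorean theorem gives:

```python
import math

# Legs of the right-angled triangle from the demo.
leg_a, leg_b = 30, 40

# Hypotenuse: sqrt(30**2 + 40**2) = sqrt(900 + 1600) = sqrt(2500)
x = math.hypot(leg_a, leg_b)
print(x)  # 50.0
```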

This capability opens up a lot of room for imagination in education, design, and collaboration.

How to get started? And important partners

Ready to get your hands dirty?

Developers: Can start building immediately through the Gemini API and Google AI Studio.
Enterprises: Can integrate it into their workflows through the Vertex AI platform.

In addition, to make this technology accessible to a wider developer community, Google has also announced collaborations with two important platforms:

OpenRouter.ai: Gemini 2.5 Flash Image becomes the first model with image generation capabilities among the more than 480 models on OpenRouter, reaching over 3 million developers.
fal.ai: A leading generative media development platform, fal.ai will further extend the reach of Gemini 2.5 Flash Image in the developer community.

It is worth mentioning that all images created or edited by Gemini 2.5 Flash Image will include an invisible SynthID digital watermark, so that they can be identified as AI-generated or edited content when needed.

Future Outlook

This journey has just begun. The Google team is still working hard to improve the rendering of long text, provide more stable character consistency, and present more accurate real-world details in images.

They are very much looking forward to seeing what amazing works developers and creators around the world will create using Gemini 2.5 Flash Image. Your feedback will be an important driving force for its continuous improvement.

Ready to embrace the new wave of AI image creation? Come and try Gemini!

