Unleashing the Power of Nano Banana: Prompt Guide and Hands-On Experience - Scuti Ai

by Hoang The Canh

September 12, 2025

Introduction

In late August 2025, Google officially launched Nano Banana, the nickname for Gemini 2.5 Flash Image, a new image generation and editing model within the Gemini ecosystem. The release is a significant milestone, not only for the model's speed and lightweight performance, but also for its ability to maintain character consistency, edit details using natural language, and combine multiple image sources into a cohesive final composition.

Unlike platforms such as Midjourney or Stable Diffusion, which lean heavily toward creative artistry, Nano Banana focuses on practical applications: supporting marketing design, visual education, digital content production, and even academic research. Its strength lies in delivering sharp, emotionally rich images with fine control through prompts, all without requiring powerful hardware on the user's side, since generation runs on Google's servers.

In this article, drawing on years of experience researching and deploying AI, I will break down Nano Banana’s Prompt Guide, share effective prompting strategies, and present three real-world use cases so readers can quickly grasp and apply them.

Summary of Nano Banana’s Prompt Guide

According to the official documentation, Nano Banana supports three image generation modes:

1. Text-to-Image

Enter a detailed description → AI generates an image from the text.
Best for creating visuals from completely new ideas.

2. Image + Text-to-Image (Editing)

Provide a base image and use a prompt to edit, add, or remove details.
Advantage: Preserves the main layout while changing only the elements you specify.

3. Multi-Image Fusion

Combine multiple images to form a unified composition.
Ideal for illustration design where multiple separate elements need to be merged.

Key Point: Nano Banana does not perform well with fragmented “keyword list” prompts. It works best with contextual, story-like prompts.
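
In practice, the three modes above differ mainly in how many input images accompany the text prompt. The short Python sketch below makes that distinction explicit. Note that `pick_mode` and the mode labels are hypothetical names chosen for illustration; they are not identifiers from any Google SDK.

```python
from typing import Sequence


def pick_mode(prompt: str, images: Sequence[str] = ()) -> str:
    """Classify a request by how many input images accompany the prompt.

    Hypothetical helper: the labels mirror the three modes described in
    this article, not names from an official API.
    """
    if not prompt.strip():
        raise ValueError("a text prompt is required in every mode")
    if len(images) == 0:
        return "text-to-image"
    if len(images) == 1:
        return "image-editing"
    return "multi-image-fusion"


print(pick_mode("A black cat on a mossy rock"))                      # text-to-image
print(pick_mode("Add lanterns overhead", ["market.png"]))            # image-editing
print(pick_mode("Combine these images", ["field.png", "dress.png"])) # multi-image-fusion
```

The file names are placeholders; the point is only that the same descriptive prompt style applies whether zero, one, or several images are attached.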

Prompting Guide and Strategies

This, in my opinion, is the core of unlocking Nano Banana’s potential. Below are principles and strategies I have distilled — with concrete examples:

1. Describe, Don’t List

Common mistake:
“cat, moon, forest” → results in a disjointed image; AI struggles to infer intent.

Better approach:
“A black cat sitting quietly on a mossy rock under the moonlight, surrounded by a misty forest.”

Why it works: Storytelling prompts help the AI understand the space and the relationships between objects, so it produces more coherent images.
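
The contrast above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a prompt can be split into subject, action, place, and surroundings; `describe_scene` is a hypothetical helper, not part of any tool.

```python
def describe_scene(subject: str, action: str, place: str, surroundings: str) -> str:
    """Join scene elements into one sentence that states how they relate."""
    return f"{subject} {action} {place}, {surroundings}."


# Keyword list: the elements are named, but nothing says how they relate.
listed = ", ".join(["cat", "moon", "forest"])

# Narrative: the same elements, now placed in space and tied together.
narrative = describe_scene(
    subject="A black cat",
    action="sitting quietly on",
    place="a mossy rock under the moonlight",
    surroundings="surrounded by a misty forest",
)

print(listed)     # cat, moon, forest
print(narrative)  # A black cat sitting quietly on a mossy rock under the moonlight, surrounded by a misty forest.
```

Filling in the four slots forces you to decide where the subject is and what surrounds it, which is exactly the information a keyword list leaves out.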

2. Add Style, Emotion, and Technical Cues

You can “direct” your image by adding:

Mood: calm, dramatic, mysterious
Art style: ukiyo-e, watercolor, cyberpunk neon
Camera cues: 50mm lens, wide-angle, golden-hour lighting

Example:

Basic Prompt: “A Vietnamese street at night.”
Enhanced Prompt: “A Vietnamese street at night, illuminated by neon signs and glowing lanterns, cinematic cyberpunk style, wide-angle shot, moody atmosphere.”

The enhanced version produces richer, more visually engaging results.
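
One way to make this habit repeatable is to keep the base description and the cues separate, then join them when you are ready to generate. The sketch below does that in Python; `enhance_prompt` and its parameter names are my own illustrative choices, not an official interface.

```python
def enhance_prompt(base: str, details: str = "", style: str = "",
                   camera: str = "", mood: str = "") -> str:
    """Append whichever cues are provided to a base description."""
    core = base.rstrip(". ")  # drop the trailing period so cues can follow
    cues = [cue for cue in (details, style, camera, mood) if cue]
    return ", ".join([core, *cues]) + "."


enhanced = enhance_prompt(
    "A Vietnamese street at night.",
    details="illuminated by neon signs and glowing lanterns",
    style="cinematic cyberpunk style",
    camera="wide-angle shot",
    mood="moody atmosphere",
)
print(enhanced)
```

With these inputs the function rebuilds the enhanced prompt from the example above, and leaving a cue empty simply omits it.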

3. Iterative Prompting

Strategy: Write a basic prompt → generate → analyze result → add or remove details.

Example:

First Prompt: “A woman in Ao Dai standing in a rice field.”
Result: Accurate but plain.
Refined Prompt: “A woman in a flowing white Ao Dai standing in a golden rice field at sunrise, soft pastel tones, cinematic feel.”
Result: Artistic, visually rich, and closer to the desired emotion.
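
When iterating, it helps to record each revision together with the observation that prompted it, so you can see which change produced which improvement. Below is a minimal Python sketch of such a log. `PromptSession` is a hypothetical helper; it only tracks text and does not call any image model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PromptSession:
    """Keeps every prompt revision with the note that motivated it."""
    history: List[Tuple[str, str]] = field(default_factory=list)

    def revise(self, prompt: str, note: str) -> str:
        self.history.append((prompt, note))
        return prompt

    @property
    def current(self) -> str:
        # The latest revision is the one to send to the generator next.
        return self.history[-1][0]


session = PromptSession()
session.revise(
    "A woman in Ao Dai standing in a rice field.",
    note="first attempt",
)
session.revise(
    "A woman in a flowing white Ao Dai standing in a golden rice field "
    "at sunrise, soft pastel tones, cinematic feel.",
    note="result was accurate but plain; added color, time of day, and tone",
)

for step, (prompt, note) in enumerate(session.history, start=1):
    print(f"{step}. [{note}] {prompt}")
```

Changing only one or two details per revision keeps the log readable and makes it easier to attribute a better or worse result to a specific edit.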

4. Consistency & Control

Nano Banana can keep characters consistent across multiple images.

Technique: Repeat fixed descriptions across prompts (e.g., “a young man with short black hair, wearing a blue jacket”).
This is particularly useful for building character illustrations for stories, games, or brand identity.
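
The repetition technique is easy to automate: store the fixed description once and reuse it verbatim in every prompt. The Python sketch below assumes that simple approach; `scene_with_character` and the three scenes are hypothetical examples of mine, not taken from the official guide.

```python
# Fixed description from the example above, stored once and reused verbatim.
CHARACTER = "a young man with short black hair, wearing a blue jacket"


def scene_with_character(character: str, scene: str) -> str:
    """Prefix a scene with the unchanged character description."""
    return f"{character[0].upper()}{character[1:]}, {scene}."


# Illustrative scenes (assumptions for this sketch).
scenes = [
    "waiting at a rainy bus stop at night",
    "reading in a sunlit library",
    "cycling along a coastal road at dawn",
]

prompts = [scene_with_character(CHARACTER, scene) for scene in scenes]
for prompt in prompts:
    print(prompt)
```

Because the description is never retyped, small wording drifts such as "navy coat" instead of "blue jacket" cannot creep in between images.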

5. Negative Prompts

Use these to avoid common issues: distorted hands, random text, watermarks.

Example:
“…, without text, no watermark, hands clearly drawn.”

This keeps the image clean and aligned with your intention.
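
If you run into the same issues repeatedly, you can map each one to the clause that counters it and append the clauses automatically. The sketch below is one possible way to do this in Python. The clause wording comes from the example above, while `AVOID_CLAUSES`, `add_negative_cues`, and the issue keys are hypothetical names.

```python
from typing import Sequence

# Issue -> clause that steers the model away from it.
AVOID_CLAUSES = {
    "random text": "without text",
    "watermarks": "no watermark",
    "distorted hands": "hands clearly drawn",
}


def add_negative_cues(prompt: str, issues: Sequence[str]) -> str:
    """Append the clause for each listed issue to the end of the prompt."""
    core = prompt.rstrip(". ")
    clauses = [AVOID_CLAUSES[issue] for issue in issues]
    return ", ".join([core, *clauses]) + "."


# Base prompt reused from the iterative-prompting example in this article.
clean = add_negative_cues(
    "A woman in a flowing white Ao Dai standing in a golden rice field at sunrise",
    ["random text", "watermarks", "distorted hands"],
)
print(clean)
```

An unknown issue key raises a `KeyError`, which is deliberate in this sketch: it is better to notice a missing clause than to silently send a prompt without it.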

6. Think Like a Film Director

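When writing prompts, imagine you are a film director describing a single frame to your crew: where the subject stands, how the light falls, and what the viewer should feel.
This leads to better depth, lighting, and emotional clarity in the image.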

Three Real-World Prompt Experiments

1. Text-to-Image

Prompt:
A photorealistic shot of an elderly Vietnamese woman sitting in a bamboo chair, sipping herbal tea under the morning sun filtering through wooden window slats, warm and serene mood, soft golden-hour lighting, 50 mm lens.

Expected Result:
A realistic photo-like image, warm lighting, detailed bamboo textures and skin — a touching “photograph.”

2. Image + Text-to-Image (Editing)

Example 1:

Image:

Prompt:
Using this image of a modern Vietnamese street market at dusk, enhance it by adding glowing lanterns overhead, neon reflections on wet cobblestones, and a thin layer of mist for atmospheric depth, while preserving all vendors and characters.

Expected Result:
A normal evening market transformed into a cinematic scene — lanterns and neon lights creating a cyberpunk vibe while retaining the authentic Vietnamese market spirit.

Result:

Example 2:

Image:

Prompt:

Create a 1/7 scale commercialized figure of the character in the illustration, in a realistic style and environment. Place figure on a computer desk in front of computer screen, using a circular transparent acrylic base without any text. On the computer screen, display the Z-Brush modeling process of the figure. Next to the computer screen, place a BANDAI-style toy packaging box printed with the original artwork.

Expected result:

A realistic, commercial-style product photo: a 1/7 scale figure placed on a clear circular acrylic base on a computer desk. The monitor shows the Z-Brush modeling process of the figure, and next to it is a BANDAI-style packaging box printed with the original artwork. Soft studio lighting, clean composition, and vivid colors give the impression of an official product advertisement.

Result:

3. Multi-Image Fusion

Prompt:
Combine these images: a rice paddy field at sunrise, a silhouette of a Vietnamese Ao Dai, and a close-up of a traditional bánh chưng. Create a harmonious composition where the Ao Dai figure stands in the foreground, the paddy sunrise forms the background, and the bánh chưng subtly overlays in the bottom corner as a cultural emblem. Soft cinematic lighting, pastel color grading.

Expected Result:
A culturally rich composition: Ao Dai in the morning sun, golden rice fields, and bánh chưng representing tradition.

Result:

Prompt Collection and Examples

Case 1: Hand Drawing Controls Multi-Character Poses

Prompt: Have these two characters fight using the pose from Figure 3. Add appropriate visual backgrounds and scene interactions. The generated image ratio is 16:9.

Case 2: OOTD Outfit

Prompt: Choose the person in Image 1 and dress them in all the clothing and accessories from Image 2. Shoot a series of realistic OOTD-style photos outdoors, using natural lighting, a stylish street style, and clear full-body shots. Keep the person’s identity and pose from Image 1, but show the complete outfit and accessories from Image 2 in a cohesive, stylish way.

I discovered a helpful GitHub repository that compiles clear examples and detailed prompt guides. You can explore it to find inspiration and learn to use Nano Banana to its fullest:
GitHub Repository

Conclusion

Nano Banana has shown that the new generation of AI image tools goes beyond simply “making something pretty”: these tools bring control, usability, and consistency.

By studying its prompt guide and applying the right strategies, from telling a story instead of listing keywords, to adding style and mood, to using negative prompts, we can transform ideas into visuals that are both aesthetically pleasing and practically useful.

From my personal experience, I believe Nano Banana will become an essential tool for content creators, marketers, educators, and researchers. It’s not just about “generating images,” but about expanding the way we think, describe, and communicate with AI.
