This connector is only available for Next.js TypeScript web build tasks.
What you can do
Image analysis
Let users upload images and get detailed descriptions, alt text, or extracted information using Gemini’s vision capabilities.
Multimodal chat
Let users send both text and images in a conversation, with Gemini responding using context from both.
Content generator
Generate blog posts, product descriptions, social media copy, or marketing content from simple prompts.
Visual search
Let users search by uploading an image and finding similar items or related information.
Document OCR
Extract text, tables, and structured data from photos of documents, receipts, or handwritten notes.
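Most of the capabilities above come down to sending Gemini a text prompt together with an image. The following is a minimal sketch of such a request against the public Gemini REST endpoint, not the code Rocket generates. The model name and the helper names `buildImagePrompt` and `describeImage` are assumptions made for illustration.

```typescript
// Sketch only: the model name and helper names are assumptions, not Rocket output.
const GEMINI_MODEL = "gemini-2.5-flash"; // assumed model; swap in the one you use

interface TextPart { text: string }
interface InlineImagePart { inline_data: { mime_type: string; data: string } }
type Part = TextPart | InlineImagePart;

interface GenerateContentRequest {
  contents: { role: "user"; parts: Part[] }[];
}

// Builds the JSON body for a single-turn prompt that pairs one image with text.
function buildImagePrompt(
  prompt: string,
  imageBase64: string,
  mimeType: string,
): GenerateContentRequest {
  return {
    contents: [
      {
        role: "user",
        parts: [
          { inline_data: { mime_type: mimeType, data: imageBase64 } },
          { text: prompt },
        ],
      },
    ],
  };
}

// Sends the request. Call this from server-side code so the key stays off the client.
async function describeImage(apiKey: string, body: GenerateContentRequest): Promise<string> {
  const url = `https://generativelanguage.googleapis.com/v1beta/models/${GEMINI_MODEL}:generateContent`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Gemini request failed with status ${res.status}`);
  const json = (await res.json()) as {
    candidates?: { content?: { parts?: { text?: string }[] } }[];
  };
  // Read the text of the first part of the first candidate, if present.
  return json.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

The same request shape would cover alt text, visual search, and OCR-style prompts. Only the prompt text changes.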
Before you connect
Get your Gemini API key from Google AI Studio.
The API key is scoped to the task you connect it in. Each task stores its own key - connecting in one task does not affect others.
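Because each task stores its own key, server-side code should fail with a clear message when the key is absent rather than send an unauthenticated request. A minimal sketch follows. The variable name `GEMINI_API_KEY` and the helper `readGeminiKey` are hypothetical, so check your project for the name actually in use.

```typescript
// Hypothetical variable name; the name used in a real project may differ.
const KEY_VAR = "GEMINI_API_KEY";

// Returns the trimmed key from an env-like object, or throws if it is missing or blank.
function readGeminiKey(env: Record<string, string | undefined>): string {
  const key = env[KEY_VAR]?.trim();
  if (!key) {
    throw new Error(`${KEY_VAR} is not set. Connect Gemini for this task first.`);
  }
  return key;
}
```

In a Next.js route handler this would typically be called with `process.env`, which keeps the key on the server.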
Connect Gemini
You can connect from two places - both open the same popup.

Option 1: From chat
Type a prompt that mentions Gemini - for example, "Connect Gemini and add an image upload that generates descriptions and alt text." Rocket detects the intent and shows a Connect button inline. Click it and the popup opens.

Option 2: From the Connectors tab
Click the ... button in the preview toolbar, then select Connectors.
Click the Gemini card, then click Connect.

After clicking Connect
A popup opens. Paste your API key and click Connect.
A green dot appears next to Gemini when the connection is active.

Update or disconnect
Open Connectors and click the Gemini card. Click Edit to update the key or Disconnect to remove it from this task.
Example prompts
| Use case | Prompt |
|---|---|
| Image description | Add an image upload that uses Gemini to generate a detailed description and alt text. |
| Photo search | Build a visual search where users upload a product photo and Gemini identifies similar items. |
| Document scanner | Create a receipt scanner that extracts line items, totals, and dates from uploaded photos. |
| Content writer | Build a blog post generator where users enter a topic and Gemini writes a draft with headings. |
| Image chat | Add a chat widget where users can attach images and Gemini responds with context from both text and image. |
| Social media generator | Generate Instagram captions and hashtags from an uploaded product photo using Gemini. |
| Diagram explainer | Let users upload a diagram or flowchart and Gemini explains each step in plain language. |
| Product catalog | Auto-generate product descriptions and categories from uploaded product images using Gemini. |
| Handwriting reader | Build a tool that converts handwritten notes to typed text using Gemini vision. |
| Quiz generator | Create a quiz from an uploaded textbook page using Gemini to extract key concepts and questions. |
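For prompts like the document scanner, the Gemini API can be asked to return JSON that follows a schema, which is easier to render than free text. The sketch below shows one possible `generationConfig` for a receipt and a small parser for the reply. The `Receipt` shape, the field names, and `parseReceipt` are assumptions for illustration, not what Rocket necessarily generates.

```typescript
// Assumed receipt shape for this sketch.
interface Receipt {
  date: string;
  total: number;
  lineItems: { description: string; amount: number }[];
}

// generationConfig for a generateContent request asking for schema-shaped JSON.
const receiptGenerationConfig = {
  responseMimeType: "application/json",
  responseSchema: {
    type: "OBJECT",
    properties: {
      date: { type: "STRING" },
      total: { type: "NUMBER" },
      lineItems: {
        type: "ARRAY",
        items: {
          type: "OBJECT",
          properties: {
            description: { type: "STRING" },
            amount: { type: "NUMBER" },
          },
          required: ["description", "amount"],
        },
      },
    },
    required: ["date", "total", "lineItems"],
  },
};

// Parses the model's JSON text and checks the top-level fields the UI depends on.
function parseReceipt(jsonText: string): Receipt {
  const data = JSON.parse(jsonText) as Partial<Receipt>;
  if (
    typeof data.date !== "string" ||
    typeof data.total !== "number" ||
    !Array.isArray(data.lineItems)
  ) {
    throw new Error("Receipt JSON is missing date, total, or lineItems");
  }
  return { date: data.date, total: data.total, lineItems: data.lineItems };
}
```

The config object would be sent as the `generationConfig` field of the request body, alongside `contents`. Validating the reply before rendering guards against a response that does not match the schema.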
Tips
- Best for multimodal tasks. If your app needs to understand images, screenshots, or documents alongside text, Gemini is the strongest choice among Rocket’s AI connectors.
- Image size limits apply. Images passed inline must be under 20 MB per request. For larger files, Rocket uses the Gemini Files API, which supports up to 100 MB.
- API billing is separate from Rocket. Google charges based on token usage and input type. Image inputs consume more tokens than text. See Google AI pricing.
- Key is scoped to this task. Each Rocket task connects to one Gemini API key.
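The size limits in the tips can be checked before an upload is sent. The sketch below picks a route from the file size, using the 20 MB and 100 MB figures stated above. It assumes MB means 1024 × 1024 bytes and, conservatively, that the inline limit applies to the base64-encoded payload. Both are assumptions, and `chooseUploadRoute` is a hypothetical helper.

```typescript
// Limits taken from the tips above; treating 1 MB as 1024 * 1024 bytes is an assumption.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;
const FILES_API_LIMIT_BYTES = 100 * 1024 * 1024;

type UploadRoute = "inline" | "files-api" | "rejected";

// Base64 turns every 3 input bytes into 4 output characters, padding the last group.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Chooses how to send an image, measuring the inline route by its encoded size.
function chooseUploadRoute(rawBytes: number): UploadRoute {
  if (base64Length(rawBytes) < INLINE_LIMIT_BYTES) return "inline";
  if (rawBytes <= FILES_API_LIMIT_BYTES) return "files-api";
  return "rejected";
}
```

Running this check in the browser before upload lets the app show a clear error for oversized files instead of waiting for the API to reject them.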

