The Text to Image node generates images from text prompts using AI image generation models. Use it to create visuals, illustrations, or graphics — for example, generating custom chart illustrations for reports, creating presentation visuals from descriptions, or producing marketing imagery from text briefs.

Core Functionality

  • Generate images from text descriptions using AI models
  • Support multiple providers including Google, OpenAI, Stability AI, Flux, and xAI
  • Configure output size and aspect ratio
  • Use personal API keys for dedicated access

Tool Inputs

  • Provider * — (Enum (Dropdown), default: Google) Select the image generation provider
  • Model * — (Enum (Dropdown), default: gemini-2.5-flash-image) Select the text-to-image model. Options vary by provider
  • Prompt * — (String) Text description of the image to generate. Be specific about desired content, style, and composition
  • Size — (Enum (Dropdown), default: 1024x1024) Output image dimensions. Only visible for OpenAI provider
  • Aspect Ratio — (Enum (Dropdown), default: 1:1) Output aspect ratio. Workflows only
  • Use Personal API Key — (Boolean, default: No) Toggle to use your own API key
  • Api Key — (String) Your API key. Only visible when Use Personal API Key is enabled
* indicates a required field
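
The defaults and visibility rules listed above can be summarized in a small sketch. This is an illustration only, not the product's API: the class name `TextToImageInputs`, the helper functions, and the `in_workflow` flag are hypothetical, and treating the API key as mandatory once the toggle is on is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextToImageInputs:
    """Hypothetical model of the inputs listed above (not the product's API)."""
    prompt: str                              # required
    provider: str = "Google"                 # required, documented default
    model: str = "gemini-2.5-flash-image"    # required, options vary by provider
    size: str = "1024x1024"                  # only visible for the OpenAI provider
    aspect_ratio: str = "1:1"                # workflows only
    use_personal_api_key: bool = False
    api_key: Optional[str] = None            # only visible when the toggle is on


def visible_fields(inputs: TextToImageInputs, in_workflow: bool = True) -> List[str]:
    """Return the field names shown under the documented visibility rules."""
    fields = ["provider", "model", "prompt"]
    if inputs.provider == "OpenAI":
        fields.append("size")
    if in_workflow:
        fields.append("aspect_ratio")
    fields.append("use_personal_api_key")
    if inputs.use_personal_api_key:
        fields.append("api_key")
    return fields


def validate(inputs: TextToImageInputs) -> List[str]:
    """Collect problems; an empty list means the inputs look complete."""
    problems = []
    if not inputs.prompt.strip():
        problems.append("Prompt is required")
    # Assumption: a key must be supplied once the personal-key toggle is on.
    if inputs.use_personal_api_key and not inputs.api_key:
        problems.append("Api Key is required when Use Personal API Key is enabled")
    return problems
```

With the defaults, `visible_fields(TextToImageInputs(prompt="a bar chart"))` omits `size` and `api_key`, which matches the rules in the list.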

Tool Outputs

  • image — (Image) The generated image

Overview

When added to an agent, the Text to Image tool lets the agent generate images during a conversation from the user’s descriptions. The agent interprets the user’s request and writes a suitable prompt for the image generation model.

Use Cases

  • Presentation visual creation — Users describe the visual they need and the agent generates it for their presentation.
  • Report illustration — Generate custom illustrations or diagrams to accompany financial reports.
  • Marketing content — Create visual content from text descriptions for marketing materials.
  • Concept visualization — Turn abstract financial concepts into visual representations.

How It Works

  1. Add the tool to your agent. In the agent builder, click Add Tool and select Text to Image from the available tools.
     [Image: Agent tool panel showing Generate Image (Text to Image) tool in the tool list]
  2. Configure input fields. Each field can either be filled automatically by the agent based on conversation context, or locked to a fixed value:
    • Provider — Select the image generation provider
    • Model — Choose the generation model
    • Prompt — The agent fills this based on the user’s request
    • Quality — Set the output quality
     [Image: Generate Image tool configuration showing fields with sparkle icon to toggle between dynamic and static values]
  3. Write the Tool Description. Describe what the tool does so the agent knows when to use it.
  4. Set Auto Run behavior. Choose: Auto Run, Require User Approval, or Let Agent Decide.
     [Image: Generate Image tool requiring user approval]
  5. Test the tool. Ask the agent to generate an image and verify the output.
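
As a rough mental model of the steps above, the sketch below shows how locked and agent-filled fields might combine, and how the three Auto Run choices could gate a call. Everything here is hypothetical: `resolve_fields`, `should_run`, and the `RunMode` enum are illustrative names, and the interpretation of Let Agent Decide (the agent chooses whether to ask first) is an assumption, not documented behavior.

```python
from enum import Enum
from typing import Any, Dict


class RunMode(Enum):
    AUTO_RUN = "Auto Run"
    REQUIRE_USER_APPROVAL = "Require User Approval"
    LET_AGENT_DECIDE = "Let Agent Decide"


def resolve_fields(locked: Dict[str, Any], agent_filled: Dict[str, Any]) -> Dict[str, Any]:
    """Merge field values; a locked (static) value overrides the agent's value."""
    resolved = dict(agent_filled)
    resolved.update(locked)
    return resolved


def should_run(mode: RunMode, user_approved: bool = False,
               agent_wants_approval: bool = False) -> bool:
    """Decide whether the tool call proceeds under the chosen mode."""
    if mode is RunMode.AUTO_RUN:
        return True
    if mode is RunMode.REQUIRE_USER_APPROVAL:
        return user_approved
    # Assumption: under Let Agent Decide, the agent may run directly
    # or ask first, in which case the user's approval is needed.
    return user_approved or not agent_wants_approval


# Example: Provider is locked, Prompt is left for the agent to fill.
fields = resolve_fields(
    locked={"provider": "Google"},
    agent_filled={"provider": "OpenAI", "prompt": "A bar chart of quarterly revenue"},
)
```

In this example the locked provider wins, so `fields["provider"]` is `"Google"` while the prompt comes from the agent.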

Settings

Setting              | Type     | Default                | Description
Provider             | Dropdown | Google                 | The image generation provider.
Model                | Dropdown | gemini-2.5-flash-image | The text-to-image model.
Use Personal API Key | Boolean  | No                     | Use your own API key.

Best Practices

  • Write detailed prompts. The more specific the description, the better the output. Include details about composition, style, colors, and content.
  • Use Require User Approval. Let users review generated images before they’re used in outputs.
  • Match the provider to your quality needs. Different providers excel at different image styles.
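
One way to apply the detailed-prompt advice is to assemble the prompt from the elements named above (content, composition, style, colors). The helper below is a minimal sketch; `build_prompt` and its parameters are hypothetical and not part of the tool, and the labeled-sentence layout is just one reasonable choice.

```python
def build_prompt(content: str, composition: str = "", style: str = "", colors: str = "") -> str:
    """Join the main subject with any optional details into one prompt string."""
    parts = [content.strip().rstrip(".")]
    for label, value in (("Composition", composition), ("Style", style), ("Colors", colors)):
        value = value.strip().rstrip(".")
        if value:
            # Skip details that were not supplied.
            parts.append(f"{label}: {value}")
    return ". ".join(parts) + "."


prompt = build_prompt(
    "A bar chart of quarterly revenue",
    style="flat vector illustration",
    colors="navy and teal",
)
# prompt == "A bar chart of quarterly revenue. Style: flat vector illustration. Colors: navy and teal."
```

Keeping each detail as a separate argument makes it easy to lock some of them (for example, a house style) while leaving the content to vary per request.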

Common Issues

For troubleshooting common issues with this node, see the Common Issues documentation.