# LLM & media tools

> Generate and analyze images, audio, and documents.

Add these tools with `AgentTools.<tool>(tool_name="...", ...)` or `agent.add_tool.<tool>(tool_name="...", ...)`. Every tool requires a unique `tool_name=`. Each entry lists the tool's configuration parameters. See the [Agent reference](/sdk/agent/reference) for attaching and running tools.

## `ai_image_to_image`

Modify and edit images with AI by providing modification instructions.

Platform docs: [Image to Image](https://docs.vectorshift.ai/platform/pipelines/multi-modal/image-to-image)

```python
AgentTools.ai_image_to_image(tool_name="...", use_personal_api_key=True, provider="google", api_key="...")
```

**Parameters**

- `use_personal_api_key`: Use your personal API key.
- `provider`: Select the model provider. One of: `google`, `openai`.
- Select the aspect ratio for the output image.
- Array of input images to modify. Provide 1-3 images for best results.
- Select the image-to-image model.
- Tell the AI model how you would like it to modify the images. Be as specific as possible. For example, you can instruct the model to change colors, add elements, apply artistic styles, or blend multiple images. Must not be empty.
- Select the size.
- `api_key`: Input your personal API key from the model provider. Note: if you do not have access to the selected model, the workflow will not run.

## `ai_image_to_text`

Generate text from an image using AI.

Platform docs: [Image to Text](https://docs.vectorshift.ai/platform/pipelines/multi-modal/image-to-text)

```python
AgentTools.ai_image_to_text(tool_name="...", use_personal_api_key=True, json_response=True, stream=True, image="...")
```

**Parameters**

- `use_personal_api_key`: Use your personal API key.
- `json_response`: Return the response as a JSON object.
- `stream`: Stream the response.
- `image`: The image to analyze.
  For agent tool calls, pass an existing file reference such as `$history.MESSAGE_ID`, `$tool.CALL_ID.OUTPUT_KEY`, or `$input.NAME`; never pass an empty string.
- The maximum number of tokens to generate.
- Instructions on what you want to analyze from the image.
- Tell the AI model how you would like it to respond. Be as specific as possible. For example, you can instruct the model on what tone to respond in or how to respond given the information you provide.
- The temperature of the model.
- The top-p value.
- The JSON schema to use for the response.
- `api_key`: Input your personal API key from the model provider. Note: if you do not have access to the selected model, the workflow will not run.

## `ai_speech_to_text`

Generate text from audio using AI.

Platform docs: [Speech to Text](https://docs.vectorshift.ai/platform/pipelines/multi-modal/speech-to-text)

```python
AgentTools.ai_speech_to_text(tool_name="...", use_personal_api_key=True, provider="deepgram", audio="...", api_key="...")
```

**Parameters**

- `use_personal_api_key`: Use your personal API key.
- `provider`: Select the model provider. One of: `deepgram`, `openai`.
- `audio`: The audio for conversion.
- Select the speech-to-text model.
- Select the tier.
- `api_key`: Input your personal API key from the model provider. Note: if you do not have access to the selected model, the workflow will not run.

## `ai_text_to_image`

Generate an image from text using AI.

Platform docs: [Text to Image](https://docs.vectorshift.ai/platform/pipelines/multi-modal/text-to-image)

```python
AgentTools.ai_text_to_image(tool_name="...", use_personal_api_key=True, provider="flux", api_key="...")
```

**Parameters**

- `use_personal_api_key`: Use your personal API key.
- `provider`: Select the model provider. One of: `flux`, `openai`, `stabilityai`, `xai`.
- Select the aspect ratio.
- Tell the AI model how you would like it to respond. Be as specific as possible. For example, you can instruct the model to use bright colors. Must not be empty.
- Select the size.
- `api_key`: Input your personal API key from the model provider.
  Note: if you do not have access to the selected model, the workflow will not run.

## `ai_text_to_speech`

Generate audio from text using AI.

Platform docs: [Text To Speech](https://docs.vectorshift.ai/platform/pipelines/multi-modal/text-to-speech)

```python
AgentTools.ai_text_to_speech(tool_name="...", use_personal_api_key=True, text="...", api_key="...")
```

**Parameters**

- `use_personal_api_key`: Use your personal API key.
- `text`: The string input for conversion.
- Select the text-to-speech model.
- Select the voice.
- `api_key`: Input your personal API key from the model provider. Note: if you do not have access to the selected model, the workflow will not run.

## `reducto_extract`

Extract structured data from documents using Reducto.

```python
AgentTools.reducto_extract(tool_name="...")
```

**Parameters**

- Your personal Reducto API key.
- Enable agentic deep extraction for higher accuracy. Uses iterative verification against the source material.
- Documents to extract data from (up to 2,500 pages per document).
- A JSON schema defining the structure of data to extract. Use descriptive field names.
- Instructions for how the AI should extract and verify data from the documents.
- Use your own Reducto API key instead of the platform default.
- Return citation bounding boxes for extracted fields.
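
Two constraints stated on this page apply across tools: every tool needs a unique `tool_name`, and `provider` must be one of the listed values. A minimal sketch of a pre-flight check for both follows. It is plain Python and does not call the VectorShift SDK; `ALLOWED_PROVIDERS` and `check_tool_configs` are hypothetical names introduced here for illustration, and the provider sets are copied from the "One of:" lists above.

```python
# Hypothetical pre-flight checks; none of these names come from the VectorShift SDK.
# Provider choices are copied from the "One of:" lists on this page.
ALLOWED_PROVIDERS = {
    "ai_image_to_image": {"google", "openai"},
    "ai_speech_to_text": {"deepgram", "openai"},
    "ai_text_to_image": {"flux", "openai", "stabilityai", "xai"},
}


def check_tool_configs(configs):
    """Return a list of problems found in (tool_type, kwargs) pairs.

    An empty list means every check passed.
    """
    problems = []
    seen_names = set()
    for tool_type, kwargs in configs:
        # Every tool requires a tool_name, and it must not repeat.
        name = kwargs.get("tool_name")
        if not name:
            problems.append(f"{tool_type}: missing tool_name")
        elif name in seen_names:
            problems.append(f"{tool_type}: duplicate tool_name {name!r}")
        else:
            seen_names.add(name)

        # Only check provider for tools whose choices are listed on this page.
        allowed = ALLOWED_PROVIDERS.get(tool_type)
        provider = kwargs.get("provider")
        if allowed is not None and provider is not None and provider not in allowed:
            problems.append(
                f"{tool_type}: provider {provider!r} is not one of {sorted(allowed)}"
            )
    return problems


configs = [
    ("ai_text_to_image", {"tool_name": "draw", "provider": "flux"}),
    ("ai_image_to_image", {"tool_name": "draw", "provider": "deepgram"}),
]
for problem in check_tool_configs(configs):
    print(problem)
```

Here the second entry is reported twice: once for reusing the name `draw`, and once because `deepgram` is not a listed provider for `ai_image_to_image`. Running a check like this before building the `AgentTools` calls surfaces configuration mistakes earlier than a failed workflow run would.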