
- If toggle is on “Variable”, reference audio files from other nodes
- If toggle is on “Upload”, upload an audio file directly on the node
Node Inputs
- Audio: The audio for conversion
- Type:
Audio
- Type:
Node Parameters
- Provider: Provider of the AI model you want to use. The default provider is OpenAI.
- Model: Model name you want to use.
- Use Personal Api Key: This allows you to enter your API key.
Node Outputs
- Text: Audio as converted to text
- Type:
Text
- Example usage:
{{ai_speech_to_text_0.text}}
- Type:
Example
The below example shows a pipeline that takes audio input, converts it to text, processes it with an LLM, and converts the response back to audio.- Input Node: Contains the input audio
- Speech to Text Node: Converts the audio to text
- Audio:
{{input_0.audio}}
- Audio:
- LLM Node: Processes the text / Answers the Question
- Input:
{{ai_speech_to_text_0.text}}
- Input:
- Text to Speech Node: Converts the LLM response to audio
- Text:
{{openai_0.response}}
- Text:
- Output: The final audio response
- Output:
{{ai_text_to_speech_0.audio}}
- Output:

Pricing
Provider | Model | Input cost per minute |
---|---|---|
OpenAI | whisper-1 | 0.006 |
Deepgram | nova-3 | 0.0043 |
Deepgram | nova-2 | 0.0043 |
Deepgram | nova | 0.0043 |
Deepgram | enhanced | 0.0145 |
Deepgram | base | 0.0123 |