Introduction
This article explains how to customize voice output in the Commercial Version AI Voice Generator using Speech Prompts, Voice Design, and Inline Voice Tags.
These tools allow you to control how text is delivered, including tone, pacing, style, and accent, as well as more detailed expressive elements like emotion and vocal sounds. Depending on the feature used, customization can be applied globally across supported voices, tailored to a specific voice, or embedded directly within the text for precise, contextual control.
Speech Prompts provide flexible, cross-voice adjustments that can be toggled on or off and configured using presets, custom instructions, or AI-generated prompts.
Voice Design offers deeper customization by modifying the characteristics of a selected voice, and it overrides any active Speech Prompt.
Inline Voice Tags allow for the most granular control, letting you define delivery at the sentence or paragraph level, including expressive cues such as tone shifts or vocal sounds.
Support for these features varies by model. Speech Prompts are supported by OpenAI and Gemini (2.5 and 3.1) models, Voice Design is supported by Gemini (2.5 and 3.1) models, and Inline Voice Tags are supported by Gemini 3.1 only.
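The support rules above can be written down as a small lookup table. This is only an illustrative sketch: the model keys, the feature names, and the helper functions are made-up labels for this example, not identifiers used by the product.

```python
# Feature support by model, transcribed from the paragraph above.
# All keys and feature names here are illustrative labels (assumptions),
# not identifiers taken from the product.
SUPPORT = {
    "openai": {"speech_prompt"},
    "gemini-2.5": {"speech_prompt", "voice_design"},
    "gemini-3.1": {"speech_prompt", "voice_design", "inline_voice_tags"},
}


def supports(model: str, feature: str) -> bool:
    """Return True if the given model supports the given feature."""
    return feature in SUPPORT.get(model, set())


def models_for(feature: str) -> list[str]:
    """Return the models that support a feature, in alphabetical order."""
    return sorted(m for m, feats in SUPPORT.items() if feature in feats)


if __name__ == "__main__":
    print(supports("gemini-2.5", "inline_voice_tags"))
    print(models_for("voice_design"))
```

A lookup like this makes it easy to check, before writing a script, whether the model you plan to use can honor the customization you have in mind.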
While using customized voice options, keep in mind the following credit costs:
OpenAI = 1 credit/character
Gemini 2.5 = 1 credit/character
Gemini 3.1 = 2 credits/character
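As a rough illustration of the rates listed above, the sketch below estimates the credit cost of a script. It assumes that every character counts, including spaces, punctuation, and any inline tags; the article does not say how characters are counted, so treat the result as an estimate only. The model keys and the function name are hypothetical.

```python
# Credits per character, taken from the list above.
# The model keys are illustrative labels, not product identifiers.
CREDITS_PER_CHARACTER = {
    "openai": 1,
    "gemini-2.5": 1,
    "gemini-3.1": 2,
}


def estimate_credits(text: str, model: str) -> int:
    """Estimate the credit cost of reading `text` with `model`.

    Assumption: every character is billed, including spaces,
    punctuation, and inline tags. The actual counting rules may differ.
    """
    return len(text) * CREDITS_PER_CHARACTER[model]


if __name__ == "__main__":
    script = "Welcome to the course."
    for model in CREDITS_PER_CHARACTER:
        print(model, estimate_credits(script, model))
```

Under this assumption, the same script costs twice as many credits on Gemini 3.1 as on OpenAI or Gemini 2.5.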
Speech Prompt
Speech Prompts allow you to shape how your content is read by applying a consistent tone, pacing, style, or accent across supported voices.
You can use prompts to create a reading style that matches your content, whether that is something instructional, conversational, dramatic, or branded. To get started quickly, you can choose from built-in presets such as eLearning, Audiobooks, or Social Media, or write your own custom prompt for more specific control.
Speech Prompts work similarly to Voice Design, but instead of modifying a single voice, they apply to all compatible voices at once. This makes them useful when you want a consistent delivery style without needing to adjust each voice individually.
You can create and save multiple prompt styles and switch between them as needed. Only one Speech Prompt can be active at a time, and it can be easily toggled on or off depending on whether you want the customized style applied.
Speech Prompts are supported by OpenAI and Gemini (2.5 and 3.1) voices.
Create a Speech Prompt
Open the Voices panel. If needed, select the voice provider drop-down menu and select OpenAI or Gemini.
In the same panel, click on Speech Prompt and choose the Custom tab.
Type a description of how you want the voice to deliver the text, such as pace, tone, or emotion. Optionally, select AI Auto-Generate to generate a random prompt.
Click on Preview to listen, and adjust your prompt if needed.
Select Apply Tone to save and apply your prompt.
Here are some examples of what Speech Prompts can look like:
eLearning
Deliver in an educational, clear, and professional tone for eLearning. Maintain a moderate, steady pace with consistent energy and no rushed sentences.
Soft and Gentle
Read this text aloud in a soft and gentle tone. Use a slower pace with a soft, breathy vocal texture.
Advertisements
A cheerful, energetic voice with a fast pace and strong emphasis on key benefits. Sounds modern, engaging, and slightly playful.
Apply a Speech Prompt
Open the Voices panel. If needed, select the voice provider drop-down menu and select OpenAI or Gemini.
In the same panel, click on Speech Prompt.
Choose the Preset tab to pick from one of our ready-to-use prompts or go to History to view and select a custom prompt you've previously created.
Mouse over an option to read its prompt, then click on Use to apply it.
Toggle Speech Prompt
When a Speech Prompt is active, you'll see it in the Voices panel.
To change the Speech Prompt or turn it off, simply click on the active prompt and choose Change Style or Turn Off Style.
Voice Design
Voice Design allows you to customize how a specific voice sounds by using written prompts to shape its tone, pacing, style, and even accent.
Like Speech Prompts, Voice Design uses natural language descriptions to define how the voice should be delivered. You can select from available presets or create your own by writing a custom description to match your desired style.
The key difference is that Voice Design applies your prompt directly to a single voice, changing how that voice performs whenever the design is used. Because designs are tied to individual voices, you can use different styles at the same time across multiple voices, rather than applying one style to all voices at once. You can also create multiple designs of the same voice and switch between them as needed.
If a Speech Prompt is active while using a designed voice, the Voice Design's style will take precedence and override the overall Speech Prompt.
Voice Design is supported by Gemini (2.5 and 3.1) voices.
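The precedence rule described above (a designed voice's style overrides the active Speech Prompt) can be sketched as a simple selection function. This is a conceptual illustration only; the function and parameter names are hypothetical and do not reflect how the product is implemented.

```python
from typing import Optional


def effective_style(speech_prompt: Optional[str],
                    voice_design: Optional[str]) -> Optional[str]:
    """Pick the style prompt that governs delivery for one voice.

    speech_prompt: the active Speech Prompt, or None if it is turned off.
    voice_design:  the design attached to this voice, or None if the
                   voice is not a designed voice.
    """
    # A Voice Design is tied to the voice itself, so it wins when present.
    if voice_design is not None:
        return voice_design
    # Otherwise the Speech Prompt (if any) applies.
    return speech_prompt


if __name__ == "__main__":
    prompt = "Clear, professional eLearning tone."
    design = "Warm storyteller with a slow pace."
    print(effective_style(prompt, design))  # the design is used
    print(effective_style(prompt, None))    # the Speech Prompt is used
```

In practice this means you can keep one Speech Prompt active for most voices and still give individual designed voices their own delivery.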
Create a Voice Design
Open the Voices panel and navigate to the My Voices tab.
Go to Designed and select Design New Voice.
Under Step 1, use the drop-down to select a preset design or choose Create your own style to customize the design style.
If you're creating your own style, write a prompt describing how you want the voice to deliver the content. Optionally, include details such as pacing, emotion, use cases, or accent.
Under Step 2, use the drop-down to select a voice to apply the design style to.
Optionally click on Preview Audio to listen to a sample of the designed voice.
Choose Next: Name the Style to continue.
Edit or name your new designed voice and choose Create to save.
Using and Managing Designed Voices
Open the Voices panel and navigate to the My Voices tab.
Go to Designed to view your saved styles.
Mouse over the designed voice and click on Use to apply the voice, choose Edit (pencil icon) to modify the design, or choose Delete (trash icon) to delete the design.
Inline Voice Tags
New with Gemini 3.1, Inline Voice Tags allow you to control delivery directly within your text by inserting simple tags into a sentence or paragraph.
These tags can be used to adjust emotion, tone, and pacing, as well as add vocal sounds such as sighs, coughs, laughter, or other expressive cues. Because they are placed inline, they apply only to the specific portion of text where they are inserted, giving you precise, contextual control over how each segment is delivered.
Inline Voice Tags are designed to be easy to use, letting you modify delivery naturally as you write, without needing to adjust global settings or voice configurations.
When using Gemini 3.1, Inline Voice Tags will still work alongside other customization features such as Voice Design or Speech Prompts.
Using Inline Voice Tags
To use Inline Voice Tags, first make sure you are using a supported voice model. Open the Voices panel and select Gemini Voices, then ensure the model is set to Gemini 3.1, as Inline Voice Tags are not supported on Gemini 2.5.
Once selected, you can insert tags directly into your text using square brackets, such as [laughs] or [excited], wherever you want to modify delivery or add a vocal sound.
Tags can include emotion or tone like [sad], [excited], [crying], pace like [very quickly], or vocal sounds like [sighs] or [coughs].
Here are some examples of how Inline Voice Tags may look:
[panicked] Oh gosh, I think someone’s in the house! [whispers] Don’t move and don't make a sound.
[laughing] I can’t believe you actually did that. [giddy] That was so amazing!
I guess that's okay, [very sarcastic with heavy emphasis] if we really have to.
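If you prepare scripts outside the editor, it can help to list the tags in a piece of text or to see the text with the tags removed. The sketch below does this with a regular expression, assuming only the square-bracket syntax shown above. It is not the product's own parser, and the function names are hypothetical.

```python
import re

# Matches one square-bracket tag such as [laughs] or [very quickly].
# Assumption: tags do not nest and do not contain brackets themselves.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(text: str) -> list[str]:
    """Return the tag contents in the order they appear."""
    return [tag.strip() for tag in TAG_PATTERN.findall(text)]


def strip_tags(text: str) -> str:
    """Return the text with all tags removed and extra spaces collapsed."""
    without_tags = TAG_PATTERN.sub("", text)
    return re.sub(r"\s{2,}", " ", without_tags).strip()


if __name__ == "__main__":
    line = "[panicked] Oh gosh! [whispers] Don't move."
    print(extract_tags(line))
    print(strip_tags(line))
```

Listing the tags this way is a quick check for unbalanced brackets or misspelled cues such as [sights] before you generate audio.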
Notes
Customization features support freeform natural language. Describe the tone, emotion, or style you want.
Because this is generative, delivery and vocal cues may vary with each regeneration.
Some voices may respond better to certain customizations than others.
Certain accents or tags may not produce accurate results, or may not be supported at all.
Responsible Use Reminder
Voice customization features are creative tools, but they must be used responsibly. Do not use them to imitate the likeness of real individuals or to create misleading, deceptive, or harmful content. All generated audio should follow applicable laws, platform policies, and ethical guidelines.
Other Resources
General
Commercial Version AI Voice Generator
Looking for Personal Version?


