Adding Voice to Your Product with ElevenLabs

Learn what text-to-speech is, how the ElevenLabs Conversational AI agent works, and how to set up a voice-powered experience in your product — step by step, no prior experience required.

Imagine opening an application and having it greet you by name, walk you through a process out loud, or answer your questions in a natural speaking voice. Not a robotic monotone. Not a pre-recorded clip. A real, fluid, human-sounding voice that responds to what you say.

This is not science fiction. This is what modern text-to-speech and conversational AI make possible — and it is more accessible than most people realize.

This guide walks you through the entire process: what text-to-speech is, what the ElevenLabs Conversational AI agent does, and how to set one up for your own product. No prior experience with voice technology is needed.

What Is Text-to-Speech?

How text-to-speech works — written text flows through AI processing and emerges as natural speech

Text-to-speech — often abbreviated as TTS — is technology that converts written words into spoken audio. You give it text, and it gives you back a voice reading that text aloud.

Think about how audiobooks work. A human narrator sits in a studio, reads a book into a microphone, and that recording is packaged as an audiobook. Text-to-speech does the same thing, except the narrator is artificial intelligence. There is no studio, no microphone, no recording session. You provide the words, and the AI generates a human-sounding voice in real time.

Why TTS Matters for Products

For decades, TTS sounded robotic and unnatural — think of those early GPS voices that mispronounced street names and spoke in a stilted, mechanical cadence. People tolerated it because the alternative was silence.

That has changed dramatically. Modern TTS systems produce voices that are nearly indistinguishable from real human speech. They handle emphasis, pacing, emotion, and even conversational nuances like pausing before an important point.

This matters for product builders because voice is one of the most natural ways humans communicate. When your product can speak, it becomes more accessible, more engaging, and more human. Some practical applications:

Onboarding guides that talk new users through your product instead of showing walls of text
Accessibility features that read content aloud for users with visual impairments
Customer support agents that answer questions by voice, available around the clock
Content narration that turns blog posts, articles, or documentation into listenable audio
Interactive tutorials that guide users step by step with spoken instructions

What Is ElevenLabs?

The ElevenLabs platform — an AI agent at a dashboard with a voice library

ElevenLabs is a company that specializes in AI voice technology. They offer several products, but the ones most relevant to product builders are:

Text-to-Speech API — send text, receive natural-sounding audio
Voice Library — a collection of pre-made voices you can use, ranging from warm and friendly to authoritative and professional
Voice Cloning — create a synthetic version of a specific voice from a short audio sample
Conversational AI Agent — a complete voice assistant that can listen, understand, think, and speak back in real time

The Conversational AI Agent is the most powerful of these because it is not limited to reading text aloud. It can have actual conversations.

How the Conversational AI Agent Works

A traditional text-to-speech system is one-directional: you give it text, it speaks. The ElevenLabs Conversational AI Agent is bidirectional: it listens to what a person says, processes the meaning, decides how to respond, and speaks the response — all within a few hundred milliseconds.

Here is what happens behind the scenes when someone talks to an ElevenLabs agent:

The user speaks — their microphone captures the audio
Speech-to-text — the audio is transcribed into text so the AI can understand it
Language model processing — an AI language model (like the ones that power chatbots) interprets what the user said and generates an appropriate response
Text-to-speech — the response text is converted into natural-sounding audio
The user hears the reply — the audio plays through their speakers

This entire cycle happens fast enough that the conversation feels natural, like talking to another person on the phone.
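The five steps above can be sketched as a single conversational turn. This is a minimal illustration of the flow, not ElevenLabs' implementation: the three stage functions (`transcribe`, `generateReply`, `synthesize`) are hypothetical stand-ins, stubbed here so the sketch runs without any service.

```javascript
// One conversational turn: audio in -> text -> reply text -> audio out.
// The stage functions are hypothetical placeholders, not ElevenLabs APIs.
async function runTurn(userAudio, stages) {
  const userText = await stages.transcribe(userAudio);    // speech-to-text
  const replyText = await stages.generateReply(userText); // language model
  const replyAudio = await stages.synthesize(replyText);  // text-to-speech
  return { userText, replyText, replyAudio };
}

// Stub stages so the sketch runs on its own.
const stubStages = {
  transcribe: async (audio) => audio.spokenWords,
  generateReply: async (text) => `You said: ${text}`,
  synthesize: async (text) => ({ format: 'stub-audio', spokenWords: text }),
};

runTurn({ spokenWords: 'hello' }, stubStages).then((turn) => {
  console.log(turn.replyText);
});
```

Each stage waits for the previous one, which is why the speed of every individual stage adds up to the delay the user hears.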

Setting Up Your ElevenLabs Account

Getting started with ElevenLabs takes less time than making a cup of coffee. Here is the process.

Step 1: Create Your Account

Visit elevenlabs.io and sign up for an account. ElevenLabs offers a free tier that gives you enough credits to experiment and build a prototype. You can upgrade later if your product needs more usage.

After signing up, you land on the ElevenLabs dashboard. This is your home base — where you manage voices, create agents, and monitor usage.

Step 2: Get Your API Key

Connecting your application to ElevenLabs through an API key

Your API key is what connects your application to ElevenLabs. Think of it as a membership card — it tells ElevenLabs that your application has permission to use their services.

To find your API key:

Click on your profile icon in the bottom-left corner of the dashboard
Select Profile + API key
Copy your API key

Important: treat your API key like a password. Do not paste it into your code directly, do not share it in public repositories, and do not include it in client-side code that users can inspect. Store it in an environment variable on your server.
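One way to follow this advice on a Node.js server is sketched below. The variable name `ELEVENLABS_API_KEY` and both helper functions are assumptions made for illustration; use whatever naming your hosting setup expects.

```javascript
// Reads the API key from the environment instead of hard-coding it.
// ELEVENLABS_API_KEY is an assumed variable name.
function getApiKey(env = process.env) {
  const key = env.ELEVENLABS_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('ELEVENLABS_API_KEY is not set');
  }
  return key.trim();
}

// Produces a log-safe version of the key: only the last four characters stay visible.
function maskKey(key) {
  if (key.length <= 4) return '*'.repeat(key.length);
  return '*'.repeat(key.length - 4) + key.slice(-4);
}
```

Failing loudly when the variable is missing makes a misconfigured server obvious at startup, and masking keeps the full key out of log files.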

Step 3: Explore the Voice Library

Before creating your agent, spend a few minutes listening to the available voices. ElevenLabs provides a library of pre-made voices with different characteristics — some warm and conversational, others crisp and professional, and everything in between.

Choosing a voice — evaluating different voice characteristics for your product

When choosing a voice, consider:

Your brand personality — a playful consumer app might use a warm, casual voice; a financial product might use a calm, authoritative one
Your audience — who is hearing this voice? Match the voice to what your users expect and feel comfortable with
The use case — a customer support agent needs a patient, clear voice; a fitness coach might need an energetic, motivating one

You can preview every voice in the library before committing. Listen to several, imagine them speaking your product's content, and pick the one that feels right.

Creating Your Conversational AI Agent

Assembling an AI agent — configuring personality, voice, and knowledge

Now for the exciting part — building the agent itself. Navigate to the Conversational AI section in the ElevenLabs dashboard.

Step 1: Create a New Agent

Click Create Agent. You will see a setup interface with several sections to configure. Do not feel overwhelmed — you can start with the basics and refine later.

Give your agent a name that describes its purpose. Something like "Customer Support Agent" or "Onboarding Guide" helps you identify it later, especially if you create multiple agents.

Step 2: Define the Agent's Personality

This is where you tell the agent who it is and how it should behave. ElevenLabs calls this the system prompt — a set of instructions that shape the agent's personality and responses.

Think of this like writing a job description for a new employee. You are telling the agent:

Who it is — "You are a friendly customer support assistant for [your product name]"
How it should behave — "Be concise, helpful, and patient. If you do not know the answer, say so honestly rather than guessing."
What it should not do — "Do not discuss competitors. Do not make promises about features that do not exist."
Its tone — "Speak in a warm, conversational tone. Use simple language. Avoid jargon."

Here is an example system prompt:

You are a helpful assistant for a project management application.
Your name is Alex. You help users understand how to use the app,
answer questions about features, and guide them through common tasks.

Be friendly but concise. Users are busy and appreciate clear, direct answers.
If a question is outside your knowledge, let the user know and suggest
they contact the support team.

Never discuss pricing, competitors, or make promises about upcoming features.

The more specific your prompt, the more consistently your agent behaves.
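If you maintain several agents, it can help to assemble prompts from the same four parts each time. The sketch below is one possible approach; the function and its field names (`identity`, `behavior`, `tone`, `restrictions`) are illustrative and not part of any ElevenLabs schema.

```javascript
// Assembles a system prompt from the four parts described above.
// The field names are illustrative, not an ElevenLabs schema.
function buildSystemPrompt({ identity, behavior, tone, restrictions }) {
  const sections = [identity, behavior, tone];
  if (restrictions.length > 0) {
    sections.push(restrictions.map((rule) => `Never ${rule}.`).join(' '));
  }
  // Blank lines between sections keep the prompt easy to scan and edit.
  return sections.join('\n\n');
}

const prompt = buildSystemPrompt({
  identity: 'You are Alex, a helpful assistant for a project management application.',
  behavior: 'Be friendly but concise. If a question is outside your knowledge, say so and suggest contacting the support team.',
  tone: 'Speak in a warm, conversational tone and avoid jargon.',
  restrictions: ['discuss pricing or competitors', 'make promises about upcoming features'],
});

console.log(prompt);
```

The resulting text is what you would paste into the system prompt field.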

Step 3: Assign a Voice

Select the voice you previewed earlier from the Voice Library. You can also adjust settings like:

Stability — higher values make the voice more consistent; lower values add more expressiveness and variation
Clarity — controls how clear and articulate the voice sounds
Speed — how quickly or slowly the agent speaks

Start with the defaults and adjust after you hear the agent in conversation. Small changes make noticeable differences, so experiment gradually.
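If you later set these values from code rather than the dashboard, a small helper can keep adjustments within sensible bounds. This is a sketch under stated assumptions: the default values and numeric ranges below are placeholders, and the API may use different field names than the dashboard labels, so check the API reference before relying on them.

```javascript
// Keeps a number inside [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Assumed defaults for illustration only.
const DEFAULT_VOICE_SETTINGS = { stability: 0.5, clarity: 0.75, speed: 1.0 };

// Applies overrides on top of the defaults and keeps each value in its assumed range.
function adjustVoiceSettings(overrides = {}) {
  const merged = { ...DEFAULT_VOICE_SETTINGS, ...overrides };
  return {
    stability: clamp(merged.stability, 0, 1),
    clarity: clamp(merged.clarity, 0, 1),
    speed: clamp(merged.speed, 0.7, 1.2), // assumed bounds
  };
}

console.log(adjustVoiceSettings({ stability: 0.6 }));
```

Changing one value at a time, as in the call above, makes it easier to hear which setting caused a difference.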

Step 4: Add a Knowledge Base

If you want your agent to answer questions about your specific product, you need to give it information to work with. This is the Knowledge Base.

You can upload:

Documents — PDFs, text files, or markdown files containing your product documentation
FAQs — common questions and their answers
Website content — URLs that the agent can reference

The agent does not recite this content word for word. Instead, it draws on the relevant parts of the knowledge base and formulates a natural response. If a user asks "How do I reset my password?" and your knowledge base contains that process, the agent explains it in its own words, conversationally.
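To build intuition for the "look it up, then answer" idea, here is a toy lookup that picks the knowledge base entry sharing the most words with a question. It is only an illustration: production systems typically use more capable matching, and nothing here reflects how ElevenLabs implements its Knowledge Base. The entries are made up.

```javascript
// Splits text into lowercase words, ignoring punctuation.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Returns the entry whose text shares the most distinct words with the question.
function findBestEntry(question, entries) {
  const questionWords = new Set(tokenize(question));
  let best = null;
  let bestScore = 0;
  for (const entry of entries) {
    const entryWords = new Set(tokenize(entry.text));
    let score = 0;
    for (const word of questionWords) {
      if (entryWords.has(word)) score += 1;
    }
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best; // null when nothing overlaps
}

// Made-up entries for illustration.
const knowledgeBase = [
  { title: 'Password reset', text: 'To reset your password, open Settings and choose Reset password.' },
  { title: 'Billing', text: 'Invoices are emailed on the first day of each month.' },
];

const match = findBestEntry('How do I reset my password?', knowledgeBase);
console.log(match.title);
```

The language model would then receive the matched entry alongside the question and phrase the answer conversationally.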

Step 5: Configure Advanced Settings

ElevenLabs provides additional settings worth exploring:

First message — what the agent says when a conversation starts (e.g., "Hi, I am Alex. How can I help you today?")
Language — the primary language the agent speaks (it supports many languages)
LLM model — which AI language model powers the agent's thinking (different models offer different speed and intelligence tradeoffs)
Max conversation duration — how long a single conversation can last before automatically ending

For your first agent, the defaults work well. You can fine-tune these after testing.
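If you keep a record of these settings alongside your code, a small check can catch obvious mistakes before you test. The object shape, field names, and the two-letter language code rule below are hypothetical and chosen for illustration; they are not the ElevenLabs configuration format.

```javascript
// Hypothetical settings record mirroring the options above.
const agentSettings = {
  firstMessage: 'Hi, I am Alex. How can I help you today?',
  language: 'en',
  llm: 'your-chosen-model',
  maxDurationSeconds: 600,
};

// Returns a list of problems so they can be fixed before testing the agent.
function validateAgentSettings(settings) {
  const problems = [];
  if (!settings.firstMessage || settings.firstMessage.trim() === '') {
    problems.push('firstMessage is empty');
  }
  if (!/^[a-z]{2}$/.test(settings.language)) {
    problems.push('language should be a two-letter code');
  }
  if (!Number.isInteger(settings.maxDurationSeconds) || settings.maxDurationSeconds <= 0) {
    problems.push('maxDurationSeconds must be a positive integer');
  }
  return problems;
}

console.log(validateAgentSettings(agentSettings));
```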

Connecting the Agent to Your Product

Once your agent is configured, you need to connect it to your application. ElevenLabs provides several integration options.

The Widget (Fastest Option)

The simplest approach is the ElevenLabs embeddable widget. You add a small code snippet to your website, and a voice conversation button appears. When users click it, they start talking to your agent.

This is ideal for:

Landing pages
Documentation sites
Support pages
Quick prototypes

The widget handles all the complexity — microphone access, audio streaming, connection management — so you can focus on your agent's personality and knowledge.
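If your pages are generated from templates, the embed markup can be produced by a helper like the one below. Treat it as a sketch: the `elevenlabs-convai` element name and `agent-id` attribute are assumptions about the widget's markup, and the script URL is a placeholder. Copy the exact snippet from your agent's page in the dashboard.

```javascript
// Escapes characters that are unsafe inside an HTML attribute value.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Builds embed markup for a page template.
// Element name, attribute, and script URL are assumptions; use the dashboard's snippet.
function buildWidgetSnippet(agentId, scriptUrl) {
  return [
    `<elevenlabs-convai agent-id="${escapeAttribute(agentId)}"></elevenlabs-convai>`,
    `<script src="${escapeAttribute(scriptUrl)}" async></script>`,
  ].join('\n');
}

console.log(buildWidgetSnippet('your-agent-id-here', 'https://example.com/widget.js'));
```

Unlike the API key, the agent ID is meant to appear in page markup, which is what makes the widget usable from client-side code.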

The JavaScript SDK (More Control)

For deeper integration, ElevenLabs provides a JavaScript SDK that gives you programmatic control over the conversation. This lets you:

Start and stop conversations from your own buttons
Display the conversation transcript in real time
React to specific things the agent says
Customize the entire user interface

Here is a simplified example of what using the SDK looks like:

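// The package has also been published as '@elevenlabs/client';
// check the current ElevenLabs docs for the name to install.
import { Conversation } from '@11labs/client';

// Ask for microphone access first; the session streams the user's audio
await navigator.mediaDevices.getUserMedia({ audio: true });

// Start a conversation with your agent
const conversation = await Conversation.startSession({
  agentId: 'your-agent-id-here',
  // Called for each message; the payload is assumed to carry the text and
  // who said it, so adjust the field names if your SDK version differs
  onMessage: ({ message, source }) => {
    console.log(`${source} said:`, message);
  },
  // Called when the connection status changes
  onStatusChange: ({ status }) => {
    console.log('Status:', status);
  },
  // Called when the connection or session fails
  onError: (error) => {
    console.error('Conversation error:', error);
  }
});

// Later, when the user is done:
await conversation.endSession();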

The SDK connects to ElevenLabs through a WebSocket — a persistent connection that allows real-time, two-way audio streaming. Your application sends the user's voice to ElevenLabs, and ElevenLabs sends the agent's voice back, all happening continuously throughout the conversation.
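Displaying the transcript in real time mostly comes down to collecting messages as they arrive. The sketch below assumes each message reaches your callback as an object with `source` and `message` fields; that shape is an assumption about the SDK's payload, so adapt the field names to what your callback actually receives.

```javascript
// Appends one message to a transcript, merging consecutive messages from the same speaker.
function appendToTranscript(transcript, { source, message }) {
  const last = transcript[transcript.length - 1];
  if (last && last.source === source) {
    // Same speaker continued: extend the last entry instead of adding a new one.
    return [...transcript.slice(0, -1), { source, message: `${last.message} ${message}` }];
  }
  return [...transcript, { source, message }];
}

// Formats the transcript for display, one line per entry.
function renderTranscript(transcript) {
  return transcript.map(({ source, message }) => `${source}: ${message}`).join('\n');
}

let transcript = [];
transcript = appendToTranscript(transcript, { source: 'user', message: 'How do I invite a teammate?' });
transcript = appendToTranscript(transcript, { source: 'ai', message: 'Open the Team page.' });
transcript = appendToTranscript(transcript, { source: 'ai', message: 'Then click Invite.' });
console.log(renderTranscript(transcript));
```

In a real page you would call `appendToTranscript` from the message callback and write the rendered text into an element instead of the console.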

The REST API (Maximum Flexibility)

For server-side integrations or custom architectures, the REST API gives you complete control. This is the most flexible option but requires the most development work.
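As one concrete instance of a REST call, here is a server-side sketch that builds a text-to-speech request. It assumes Node.js 18 or later (for the built-in `fetch`), and the URL shape, the `xi-api-key` header, and the `text` body field reflect the public text-to-speech endpoint as commonly documented; confirm them against the current API reference. The voice ID is a placeholder.

```javascript
// Builds (but does not send) a text-to-speech request.
// URL shape, header name, and body field are assumptions to verify in the API reference.
function buildTtsRequest(text, voiceId, apiKey) {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    options: {
      method: 'POST',
      headers: {
        'xi-api-key': apiKey,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ text }),
    },
  };
}

// Sending it from server code would look like:
//   const { url, options } = buildTtsRequest('Hello!', 'your-voice-id', apiKey);
//   const response = await fetch(url, options);
//   const audio = Buffer.from(await response.arrayBuffer());
```

Keeping request construction separate from sending makes it easy to inspect and test without spending credits.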

Testing Your Agent

Testing the voice agent — having a real conversation to verify behavior

Before sharing your agent with users, test it thoroughly. ElevenLabs includes a built-in testing tool right in the dashboard — click the conversation button on your agent's page to start talking to it.

What to Test

Accuracy — ask questions that your knowledge base covers. Does the agent give correct answers?

Personality — does the agent maintain the tone and behavior you defined? Is it too formal? Too casual? Adjust the system prompt until it feels right.

Edge cases — ask questions the agent should not answer. Ask things outside its knowledge. Try to confuse it. A well-configured agent handles these gracefully.

Latency — is the response time fast enough for natural conversation? If there are noticeable delays, try a faster LLM model or check your network connection.

Voice quality — does the voice sound natural? Are there awkward pauses, mispronunciations, or strange inflections? Adjust the voice settings or try a different voice.
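Latency is the easiest of these checks to put numbers on. Assuming you note the delay, in milliseconds, between finishing a question and hearing the reply start, a summary like the following shows both the typical case and the worst ones. The sample values are made up.

```javascript
// Returns the value at a given percentile (0-100) using the nearest-rank method.
function percentile(sortedValues, p) {
  const rank = Math.ceil((p / 100) * sortedValues.length);
  return sortedValues[Math.max(0, rank - 1)];
}

// Summarizes response delays measured in milliseconds.
function summarizeLatency(samplesMs) {
  if (samplesMs.length === 0) throw new Error('No samples to summarize');
  const sorted = [...samplesMs].sort((a, b) => a - b);
  return {
    median: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    slowest: sorted[sorted.length - 1],
  };
}

// Made-up measurements from five test questions.
const summary = summarizeLatency([420, 380, 950, 510, 460]);
console.log(summary);
```

A comfortable median with a much higher 95th percentile usually means occasional slow turns, which users tend to notice more than the average.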

Iterating on Your Agent

Your first version will not be perfect, and that is expected. Building a good voice agent is an iterative process:

Test the agent yourself
Note what feels off — tone, accuracy, speed, personality
Adjust the system prompt, knowledge base, or voice settings
Test again
Repeat until the experience feels natural
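The loop above is easier to repeat when the test questions are written down once. A minimal sketch: each case lists phrases a good reply should or should not contain, and you paste in the reply you heard. The cases and phrases are hypothetical examples.

```javascript
// Each case pairs a question with phrases the reply must or must not contain.
const testCases = [
  { question: 'How do I reset my password?', mustInclude: ['settings'], mustNotInclude: [] },
  { question: 'How much does the Pro plan cost?', mustInclude: ['support'], mustNotInclude: ['$'] },
];

// Compares one reply against one case, ignoring letter case.
function checkReply(reply, testCase) {
  const text = reply.toLowerCase();
  const missing = testCase.mustInclude.filter((phrase) => !text.includes(phrase.toLowerCase()));
  const forbidden = testCase.mustNotInclude.filter((phrase) => text.includes(phrase.toLowerCase()));
  return { passed: missing.length === 0 && forbidden.length === 0, missing, forbidden };
}

console.log(checkReply('Open Settings and choose Reset password.', testCases[0]));
```

Phrase checks are crude, but they flag regressions quickly after a prompt or knowledge base change; tone and naturalness still need a human listener.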

When your agent handles the common scenarios well, invite a few trusted people to test it. Fresh perspectives reveal issues you might miss because you are too close to the project.

What This Looks Like in Practice

To make this concrete, here is how a product builder might use an ElevenLabs agent in a real product:

Scenario: An online cooking course platform

The agent's name is Chef, and its personality is warm, encouraging, and knowledgeable
When a student opens a recipe, they can click a button to ask Chef questions — "What can I substitute for heavy cream?" or "How do I know when the onions are caramelized?"
Chef answers using the course's recipe database as its knowledge base
If Chef does not know the answer, it suggests the student check the course's community forum

The student gets immediate, spoken answers without leaving the recipe page. The course creator gets a support system that works around the clock without hiring staff.

Cost and Pricing

ElevenLabs uses a credit-based pricing model. Each minute of agent conversation uses a certain number of credits, depending on the voice quality and the LLM you choose.

The free tier provides enough credits to build and test your agent. Paid plans offer more credits, higher quality voices, and lower latency. For most early-stage products, the free or starter tier is sufficient while you validate that voice adds value for your users.

Check the ElevenLabs pricing page for specific numbers — they update their plans periodically with new features and adjusted limits.
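Once you have the current rate, a rough monthly estimate is simple arithmetic. In the sketch below every number is a placeholder, including `creditsPerMinute`; substitute the figures from the pricing page and your own traffic expectations.

```javascript
// Estimates monthly credit usage from expected traffic.
// creditsPerMinute is a placeholder; take the real rate from the pricing page.
function estimateMonthlyCredits({ conversationsPerDay, averageMinutes, creditsPerMinute, daysPerMonth = 30 }) {
  const minutesPerMonth = conversationsPerDay * averageMinutes * daysPerMonth;
  return { minutesPerMonth, credits: minutesPerMonth * creditsPerMinute };
}

// Placeholder inputs: 20 conversations a day, 3 minutes each, 100 credits per minute.
const estimate = estimateMonthlyCredits({
  conversationsPerDay: 20,
  averageMinutes: 3,
  creditsPerMinute: 100,
});
console.log(estimate);
```

Comparing the result with each plan's monthly credit allowance shows when the free tier stops being enough.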

Key Takeaways

Text-to-speech (TTS) converts written text into natural-sounding spoken audio — and modern TTS is nearly indistinguishable from human speech.
ElevenLabs provides AI voice tools including TTS, voice cloning, and conversational AI agents that can have real-time spoken conversations.
Setting up an agent involves creating an account, configuring the agent's personality and voice, adding a knowledge base, and connecting it to your product.
Integration options range from a simple embeddable widget (minutes to set up) to a JavaScript SDK (more control) to a full REST API (maximum flexibility).
Testing is iterative — start simple, test with real conversations, adjust, and repeat until the experience feels natural.
Voice makes products more human — it improves accessibility, engagement, and user satisfaction when implemented thoughtfully.

Key Terms in This Article

Text-to-Speech (TTS): Technology that converts written text into spoken audio. Instead of a human reading words aloud, a computer generates a natural-sounding voice from text input — like having a professional narrator on demand.
ElevenLabs: A company that provides AI-powered voice technology. Their platform offers text-to-speech, voice cloning, and conversational AI agents that can speak and listen in real time.
Conversational AI Agent: A software program that can have spoken conversations with people. It listens to what someone says, understands the meaning, formulates a response, and speaks it back — all in real time, like talking to another person.
API (Application Programming Interface): A way for two software systems to talk to each other. When your application needs to use ElevenLabs voice features, it sends a request through the API and receives audio back. Think of it as a waiter carrying orders between you and the kitchen.
API Key: A unique password that identifies your application to a service like ElevenLabs. It proves you have permission to use the service and tracks your usage. Never share your API key publicly.
Agent ID: A unique identifier assigned to each conversational AI agent you create in ElevenLabs. Your application uses this ID to connect to the specific agent you configured.
Latency: The delay between when you say something and when the AI responds. Lower latency means faster, more natural conversations. High latency creates awkward pauses that make the experience feel robotic.
Voice Cloning: The process of creating a synthetic copy of a specific person's voice. ElevenLabs can analyze a short audio sample and generate new speech that sounds like that person — useful for creating a consistent brand voice.
Knowledge Base: A collection of information that an AI agent can reference when answering questions. You upload documents, FAQs, or product information, and the agent uses this content to give accurate, relevant responses.
WebSocket: A communication channel that stays open between your application and a server, allowing real-time two-way conversation. Unlike regular web requests that open and close, a WebSocket stays connected — essential for live voice conversations.
