Now that you have explored the tools for generating podcast show notes and timestamps, this tutorial picks up where that exploration left off.

Generate a Voiceover With ElevenLabs From Your Script

Turn any written script into a realistic AI narration in minutes using ElevenLabs. No microphone, no recording, no editing audio.

Prerequisites

ElevenLabs account (free tier works for most creators)
A finished script (plain text)

Step 1: Choose your voice

Open the voice library and filter the voices by:

Gender and age (young male, middle-aged female, etc.)
Accent (American, British, Australian)
Use case (narration, news, conversational, character)

Click the play icon to preview each voice on the sample text. For educational or documentary-style content, look for voices tagged "narration." For social content, look for "conversational."

Tip: Listen to at least 10 voices before choosing. The differences are subtle but matter at scale.
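If you keep your own shortlist of voices while auditioning them, the filtering described above can be sketched in a few lines. This is only an illustration: the `Voice` record, its field names, and the sample entries are made up for this example and are not ElevenLabs' data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    # Hypothetical record for a shortlist you maintain yourself.
    name: str
    gender: str
    age: str
    accent: str
    use_case: str


def filter_voices(voices, **criteria):
    """Return voices whose fields match every given criterion (case-insensitive)."""
    matches = []
    for voice in voices:
        if all(getattr(voice, field).lower() == wanted.lower()
               for field, wanted in criteria.items()):
            matches.append(voice)
    return matches


# Made-up sample data for illustration only.
SHORTLIST = [
    Voice("Voice A", "female", "middle-aged", "British", "narration"),
    Voice("Voice B", "male", "young", "American", "conversational"),
    Voice("Voice C", "male", "middle-aged", "American", "narration"),
]

narrators = filter_voices(SHORTLIST, use_case="narration", accent="american")
print([v.name for v in narrators])
```

Combining criteria this way mirrors stacking filters in the app: each extra filter narrows the list further.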

Step 2: Paste your script and preview

Click Text to Speech in the left nav. Select your chosen voice from the dropdown. Paste your script into the text box.

Before generating the full audio, click Generate on a short paragraph (2-3 sentences) to preview pacing and tone. Check:

Is the pacing too fast or slow?
Are technical words pronounced correctly?
Does the emphasis fall on the right words?
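For a long script, pulling out the preview excerpt by hand gets tedious. The sketch below grabs the first few sentences of the opening paragraph so you can paste them in for a test generation. The function name `preview_excerpt` is hypothetical, and the sentence split is deliberately naive (it will trip on abbreviations such as "e.g.").

```python
import re


def preview_excerpt(script: str, max_sentences: int = 3) -> str:
    """Return the first few sentences of the first non-empty paragraph."""
    paragraphs = [p.strip() for p in script.split("\n\n") if p.strip()]
    if not paragraphs:
        return ""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", paragraphs[0])
    return " ".join(sentences[:max_sentences])


script = (
    "Welcome to the channel. Today we cover voiceovers. "
    "It takes minutes. You will not need a microphone.\n\n"
    "First, pick a voice."
)
print(preview_excerpt(script))
```

If your script contains the brand names or technical terms you are worried about, preview a paragraph that includes them rather than the opening one.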

Step 3: Adjust settings

Below the text box, you'll find:

Stability: Higher = more consistent but robotic. Lower = more expressive but variable. Start at 50% and adjust.
Similarity: How closely the output adheres to the original voice's character. Start at 75%.
Style Exaggeration: Adds expressiveness. Use 0-20% for narration; higher for character voices.

For most narration, default settings are fine. Adjust only if the preview sounds off.
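If you later script your generations instead of using the web app, the three sliders translate to numbers between 0 and 1. The sketch below converts the percentages from this step into a settings dictionary. The key names (`stability`, `similarity_boost`, `style`) are assumed to match the `voice_settings` object in the ElevenLabs text-to-speech API, so check the current API reference before relying on them. Nothing here sends a request.

```python
import json


def percent_to_fraction(percent: float) -> float:
    """Convert a 0-100 slider value to the 0.0-1.0 range."""
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be between 0 and 100, got {percent}")
    return percent / 100


def build_voice_settings(stability_pct=50, similarity_pct=75, style_pct=0):
    """Map the three sliders to a settings dict.

    Key names are an assumption about the API's `voice_settings` object;
    verify them against the current ElevenLabs API reference.
    """
    return {
        "stability": percent_to_fraction(stability_pct),
        "similarity_boost": percent_to_fraction(similarity_pct),
        "style": percent_to_fraction(style_pct),
    }


def build_request_body(text: str, **slider_values) -> str:
    """Serialize the script text and settings as a JSON request body."""
    body = {"text": text, "voice_settings": build_voice_settings(**slider_values)}
    return json.dumps(body)


print(build_request_body("Hello and welcome.", style_pct=10))
```

The defaults mirror the starting points suggested above (50% stability, 75% similarity, no style exaggeration), so you only pass the values you want to change.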

Step 4: Add pronunciation corrections

ElevenLabs sometimes mispronounces brand names, acronyms, or technical terms. Fix these in Settings → Pronunciation Dictionary. Add custom pronunciations using IPA notation or alternate spelling. Example: "Descript" might be read as "de-SCRIPT" instead of "dee-script."

For one-off fixes without a dictionary entry, write the phonetic spelling directly in the script in place of the word, for example "Dee-script" instead of "Descript". The AI will read the phonetic version.
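When the same term appears many times, swapping in the phonetic spelling by hand is error-prone. One way to do it, sketched below, is to keep a small mapping of terms to respellings and apply it to the script before pasting. The `apply_respellings` helper and the `RESPELLINGS` mapping are hypothetical names for this example; this is a plain text substitution, not the ElevenLabs pronunciation dictionary feature.

```python
import re


def apply_respellings(script: str, respellings: dict) -> str:
    """Replace whole-word occurrences of each term with its phonetic respelling."""
    if not respellings:
        return script
    # Longest terms first so a longer phrase wins over a shorter prefix of it.
    terms = sorted(respellings, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda m: respellings[m.group(1)], script)


RESPELLINGS = {"Descript": "Dee-script"}

line = "Open Descript and import the file. The description is optional."
print(apply_respellings(line, RESPELLINGS))
```

Matching is case-sensitive and limited to whole words, so ordinary words such as "description" are left alone. Keep the original script unchanged and feed only the respelled copy to the generator.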

Step 5: Generate and download

Once you're happy with the preview, generate the full audio and download it as MP3 or WAV. Use WAV for video editing and MP3 for podcast publishing.

Step 6: Sync audio to video in your editor

Import the MP3 or WAV into your video editor (Premiere, Final Cut, Descript, CapCut) and sync the audio track to your video timeline. If your footage already contains a recorded voice track, mute or cut it and use the AI audio instead. If it's a fully scripted faceless video, the AI audio is your audio track.

Voice Cloning (Optional)

To clone your own voice:

Go to Voices → Add Voice → Instant Voice Clone
Upload 3-10 minutes of clean, single-speaker audio (no background noise, music, or other speakers)
Name the voice and save it
Generate audio the same way as before; the output now sounds like you
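Before uploading, it can help to confirm that your sample falls inside the 3-10 minute range from the steps above. The sketch below reads the length of a WAV file with Python's standard `wave` module and classifies it. The function names and the returned labels are hypothetical, and the check covers duration only, not noise or speaker count.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_clone_sample(duration_seconds: float, min_minutes=3, max_minutes=10) -> str:
    """Classify a sample length against the suggested upload range."""
    minutes = duration_seconds / 60
    if minutes < min_minutes:
        return "too short"
    if minutes > max_minutes:
        return "over the suggested range"
    return "ok"


# Example with a known duration (4.5 minutes) instead of a real file.
print(check_clone_sample(4.5 * 60))
```

To check a real recording, pass the result of `wav_duration_seconds("your_sample.wav")` into `check_clone_sample`, with the file name replaced by your own.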

Clone quality improves significantly with more training audio. 30+ minutes produces very accurate results.

Free Tier Limits

The free tier gives you 10,000 characters/month (~8-10 minutes of audio). For most solo creators just starting, this covers 1-2 videos per month. The Starter plan ($5/month) gives 30,000 characters, enough for roughly 3-6 videos of the same length.
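To see how far a quota goes for your own scripts, you can work from the figures above: 10,000 characters for about 8-10 minutes implies roughly 1,000-1,250 characters per minute. The sketch below applies that rate to a script's character count. It assumes every character, including spaces and punctuation, counts toward the quota; the function names are hypothetical.

```python
FREE_TIER_CHARS = 10_000   # monthly quota quoted above
STARTER_CHARS = 30_000     # monthly quota quoted above
# Derived from "10,000 characters is about 8-10 minutes".
CHARS_PER_MINUTE_RANGE = (1_000, 1_250)


def estimate_minutes(char_count: int) -> tuple:
    """Return (low, high) estimated audio minutes for a character count."""
    slow_rate, fast_rate = CHARS_PER_MINUTE_RANGE
    return (char_count / fast_rate, char_count / slow_rate)


def scripts_per_month(script_chars: int, quota: int) -> int:
    """How many scripts of this size fit in one month's quota."""
    if script_chars <= 0:
        raise ValueError("script_chars must be positive")
    return quota // script_chars


script = "word " * 1200  # stand-in for a real script: 6,000 characters
low, high = estimate_minutes(len(script))
print(f"{len(script)} chars, about {low:.1f}-{high:.1f} minutes")
print("Free tier:", scripts_per_month(len(script), FREE_TIER_CHARS), "per month")
print("Starter:", scripts_per_month(len(script), STARTER_CHARS), "per month")
```

Preview generations in Step 2 also use characters, so leave some headroom rather than planning to spend the full quota on final audio.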
