Gain Greater Control with Our New “Preview Audio” and “Pause” Features

Jan 11, 2025


Producing high-quality video content often depends on the finer details—such as the pronunciation of a word or the timing of a dramatic pause. We’re delighted to introduce two new features—Preview Audio and Pause—that provide you with enhanced precision and flexibility before you commit credits to generating a full video.

Why Use Preview Audio?

Preview Audio is a breakthrough for anyone wanting to ensure their text-to-speech (TTS) narration sounds just right before using credits to create a video. Previously, you would move directly from writing your script to generating the final video. While this was convenient, it left little room for fine-tuning—and if you noticed a minor error, you’d already have spent your credits. With Preview Audio, you can:

  1. Check Pronunciation & Tone
    Listen to the entire audio generated from your text and confirm it matches your intended style.
  2. Save Credits
    Spotting an error in the audio before rendering a video helps you avoid unnecessary credit usage.
  3. Avoid Streaming Artefacts
    When audio is generated in real time to sync with video (a “streaming pipeline”), some AI voices may have slight volume inconsistencies at the start and end. By using Preview Audio first, you can avoid these artefacts and achieve a more polished result.

Common Pitfalls & Text Considerations: While TTS technology has advanced significantly, certain complexities can still present challenges. Pay particular attention to:

  • Specialised or Technical Terms: Medical, legal, or scientific jargon may require extra punctuation or spelling adjustments.
  • Abbreviations: Ensure TTS expands or pronounces them correctly.
  • Currencies & Numbers: The narrator may read numbers in an unexpected way or skip over currency symbols.
  • Heavy Punctuation: Full stops, commas, and colons can affect how TTS handles intonation and pacing.

If you notice any issues, simply revise your text, run Preview Audio again, and confirm it’s perfect before clicking “Generate Talking Video”.
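If you would like to screen a script for the pitfalls listed above before spending preview characters, a rough pre-check can be sketched in Python. This is only an illustration, not part of the product: the function name `flag_risky_text` and the patterns are hypothetical, and the patterns are simple heuristics that will miss some cases and flag some harmless ones.

```python
import re

# Heuristic patterns (assumptions for illustration, not product behaviour).
CHECKS = {
    # Runs of two or more capital letters, e.g. "CEO" or "TTS".
    "abbreviation": re.compile(r"\b[A-Z]{2,}\b"),
    # A few common currency symbols.
    "currency": re.compile(r"[$£€¥]"),
    # Digits, optionally grouped with commas or decimal points.
    "number": re.compile(r"\d+(?:[.,]\d+)*"),
}


def flag_risky_text(script: str) -> dict[str, list[str]]:
    """Return the fragments of the script worth listening for in a preview."""
    return {name: pattern.findall(script) for name, pattern in CHECKS.items()}


if __name__ == "__main__":
    for kind, hits in flag_risky_text("The CEO paid $1,200 in Q3.").items():
        print(kind, hits)
```

Anything the check flags is a candidate for a closer listen in Preview Audio, not necessarily an error.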

Introducing the Pause Feature

Sometimes you want to slow things down for dramatic effect, emphasise a phrase, or handle tricky words with precision. Our new Pause option—accessible via the “⏱ +0.5” icon—lets you insert a short break anywhere in your script. If you need a longer pause, simply add multiple pause icons in your text. This manual pause can:

  • Improve Clarity: Break up long sentences so the listener can clearly understand each part.
  • Enhance Emphasis: Build anticipation before a key statement or punchline.
  • Override Default TTS Pausing: If the text-to-speech engine doesn’t pause where you want—or adds an unintended break—manually adding pauses ensures the final narration flows as you intend.

Important Tips

Preview Audio uses a character-based quota, which resets monthly according to your subscription plan. As a general guide, 1 minute of speech is roughly 1,000 characters:

  • Pro: 10,000 characters (~10 minutes of audio)
  • Advanced: 50,000 characters (~50 minutes of audio)
  • Ultra: 100,000 characters (~100 minutes of audio)
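
To plan your quota, the rule of thumb above can be turned into a quick estimate. The Python sketch below is illustrative only: it assumes every character in the script (including spaces and punctuation) counts towards the quota, which this post does not confirm, and the function names are hypothetical.

```python
# Rule of thumb from this post: roughly 1,000 characters per minute of speech.
CHARS_PER_MINUTE = 1_000

# Monthly Preview Audio quotas listed above, in characters.
PLAN_QUOTAS = {"Pro": 10_000, "Advanced": 50_000, "Ultra": 100_000}


def estimate_minutes(script: str) -> float:
    """Approximate narration length in minutes (assumes all characters count)."""
    return len(script) / CHARS_PER_MINUTE


def previews_per_month(script: str, plan: str) -> int:
    """How many full previews of this script fit in the plan's monthly quota."""
    if not script:
        raise ValueError("script must not be empty")
    return PLAN_QUOTAS[plan] // len(script)


if __name__ == "__main__":
    script = "x" * 2_500  # stand-in for a 2,500-character script
    print(estimate_minutes(script))           # about 2.5 minutes
    print(previews_per_month(script, "Pro"))  # full previews on the Pro plan
```

Treat the result as a rough guide, since actual speech rate varies with the voice and the text.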

Tips for the Pause Feature:

  • Each pause (stopwatch) icon adds a 0.5-second pause. Consecutive icons add up to a longer pause, to a maximum of 3 seconds (six icons).
  • Reminder: Although longer pauses are possible, avoid using more than two consecutive pause icons within a single text segment, as this may cause the AI to produce unexpected sounds or artefacts.
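
The pause arithmetic can be sketched as follows. This is a minimal illustration based only on the figures in this post (0.5 seconds per icon, a 3-second maximum, and the advice to keep to two consecutive icons); the function names are hypothetical, and rounding up to the next whole icon is an assumption.

```python
import math

PAUSE_SECONDS = 0.5        # length of one pause icon
MAX_PAUSE_SECONDS = 3.0    # stated maximum for consecutive icons
RECOMMENDED_MAX_ICONS = 2  # more than this may cause artefacts


def pause_icons_needed(seconds: float) -> int:
    """Number of pause icons for a pause of at least `seconds` (rounds up)."""
    if seconds <= 0:
        raise ValueError("pause length must be positive")
    if seconds > MAX_PAUSE_SECONDS:
        raise ValueError("pauses are limited to 3 seconds")
    return math.ceil(seconds / PAUSE_SECONDS)


def exceeds_recommendation(icons: int) -> bool:
    """True if this many consecutive icons goes beyond the advice above."""
    return icons > RECOMMENDED_MAX_ICONS


if __name__ == "__main__":
    icons = pause_icons_needed(1.5)
    print(icons, exceeds_recommendation(icons))
```

If a pause needs more than two icons, it is worth listening to the result in Preview Audio before generating the video.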

Use Cases & Real-World Benefits

  • Marketing & Advertising
    Marketers often use short, impactful lines followed by a well-timed pause to spark curiosity. Now you can refine your brand messaging and preview different deliveries without wasting credits.
  • E-Learning & Instructional Videos
    Complex terminology or acronyms are common in educational content. Quickly preview how they’re pronounced, insert the right pauses, and ensure learners can easily follow along.
  • Storytelling & Narration
    Dramatic voiceovers rely on precise pacing. A well-placed pause can convey suspense or emotional nuance—something the default pacing of TTS may not always achieve.
  • Professional Presentations
    When you need to make a point—such as in financial reviews or corporate pitches—mispronounced names or numbers can undermine your credibility. Previewing and adding pauses helps ensure a smooth, professional vocal track.