Gain Greater Control with VisionStory’s New “Preview Audio” and “Pause” Features

Jan 11, 2025


Creating high-quality video content often depends on the finer details, such as the way a word is pronounced or the timing of a pause. At VisionStory, we’re introducing two new features, Preview Audio and Pause, that give you more precision and flexibility before you spend credits on generating a full video.

Why Use Preview Audio?

Preview Audio is for anyone who wants to make sure their text-to-speech (TTS) narration sounds right before spending credits on video generation. Previously, you moved directly from writing your script to creating the final video. That was quick, but it left little room for fine-tuning, and any small mistake was only discovered after the credits had been spent. With Preview Audio, you can:

  1. Check Pronunciation & Tone
    Listen to the entire audio generated from your text and confirm it matches your intended style.
  2. Save Credits
    Spotting an error in the audio before generating the video helps you avoid unnecessary credit usage.
  3. Avoid Streaming Artefacts
    When audio is generated in real time to sync with video (a “streaming pipeline”), some AI voices may have slight volume inconsistencies at the start or end. By using Preview Audio first, you can avoid these issues and achieve a more polished result.

Common Challenges & Text Tips

TTS technology has advanced, but certain kinds of text can still be read incorrectly. Pay special attention to:

  • Specialised or Technical Terms: Medical, legal, or scientific words may need extra punctuation or spelling tweaks.
  • Abbreviations: Make sure TTS expands or pronounces them as you expect.
  • Currencies & Numbers: The narrator might read numbers in an unexpected way or skip currency symbols.
  • Heavy Punctuation: Full stops, commas, and colons can affect how TTS handles intonation and pacing.

If you notice any issues, simply edit your text, run Preview Audio again, and confirm it’s perfect before clicking “Generate Talking Video.”
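To decide where to listen most closely, you can scan a script for the categories listed above before previewing it. The Python sketch below is only an illustration: the function name `flag_tts_risks` and the regular expressions are assumptions made for this example, not part of VisionStory, and they will miss many real-world cases.

```python
import re

# Hypothetical patterns for illustration only; not a VisionStory feature.
# Alternatives are tried left to right, so a currency amount is reported
# once as "currency" rather than again as a plain "number".
RISK_PATTERN = re.compile(
    r"(?P<currency>[$€£¥]\s?\d+(?:,\d{3})*(?:\.\d+)?)"
    r"|(?P<abbreviation>\b[A-Z]{2,}\b|\b(?:[A-Za-z]\.){2,})"
    r"|(?P<number>\d+(?:,\d{3})*(?:\.\d+)?)"
)


def flag_tts_risks(script):
    """Return (category, matched_text) pairs worth checking in the preview."""
    return [(m.lastgroup, m.group()) for m in RISK_PATTERN.finditer(script)]


if __name__ == "__main__":
    sample = "The FDA approved it, e.g. for $1,200 in 2024."
    for category, text in flag_tts_risks(sample):
        print(f"{category}: {text}")
```

Each flagged item is a candidate for rewording, such as spelling out an abbreviation, if the preview does not read it the way you expect.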

Introducing the Pause Feature

Sometimes you want to slow down for dramatic effect, highlight a phrase, or handle tricky words with care. Our new Pause feature, accessible via the “⏱ +0.5” icon, lets you insert a short 0.5-second pause anywhere in your script. For a longer pause, place several pause icons back to back. A manual pause can:

  • Improve Clarity: Break up long sentences so your audience can easily follow each part.
  • Enhance Emphasis: Build anticipation before a key statement or punchline.
  • Override Default TTS Pausing: If the TTS engine doesn’t pause where you want—or adds an unwanted break—manually adding pauses ensures your narration flows exactly as you intend.

Important Tips

Preview Audio uses a character-based quota that resets daily; its size depends on your subscription plan. As a rough guide, 1,000 characters is about 1 minute of speech:

  • Pro Plan: 10,000 characters (~10 minutes of audio per day)
  • Advanced Plan: 50,000 characters (~50 minutes of audio per day)
  • Ultra Plan: 100,000 characters (~100 minutes of audio per day)
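The arithmetic behind these figures can be sketched in a few lines of Python. The plan quotas and the 1,000-characters-per-minute guide come from the list above. Everything else is an assumption for illustration: the function names are hypothetical, and the sketch assumes each preview consumes the script's full character count (including spaces and punctuation), which may not match how VisionStory actually counts.

```python
CHARS_PER_MINUTE = 1_000  # rough guide: 1 minute of speech is about 1,000 characters

DAILY_QUOTA = {  # characters per day, per plan
    "Pro": 10_000,
    "Advanced": 50_000,
    "Ultra": 100_000,
}


def estimate_minutes(char_count):
    """Approximate narration length in minutes for a given character count."""
    return char_count / CHARS_PER_MINUTE


def previews_remaining(plan, script_chars, used_today=0):
    """How many full previews of one script still fit in today's quota.

    Assumes every preview costs `script_chars` characters (an assumption).
    """
    if script_chars <= 0:
        raise ValueError("script_chars must be positive")
    remaining = max(0, DAILY_QUOTA[plan] - used_today)
    return remaining // script_chars


if __name__ == "__main__":
    chars = 1_500
    print(f"About {estimate_minutes(chars):.1f} minutes of audio")
    print(f"Previews left on Pro: {previews_remaining('Pro', chars)}")
```

Under these assumptions, a 1,500-character script runs about a minute and a half and could be previewed six times a day on the Pro plan.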

Tips for the Stopwatch (Pause) Feature:

  • Each stopwatch icon adds a 0.5-second pause. You can place icons consecutively for longer pauses, up to a maximum of 3 seconds (six icons).
  • Note: Although longer pauses are possible, avoid using more than two consecutive pauses in a single text segment, as longer runs may cause the AI voice to produce unexpected sounds or artefacts.
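As a worked example of the pause arithmetic, the Python sketch below converts a desired pause length into a number of stopwatch icons. The 0.5-second step and 3-second maximum come from the tips above. The function names are hypothetical, and the sketch assumes that “two consecutive pauses” means two icons.

```python
import math

PAUSE_STEP_SECONDS = 0.5   # each stopwatch icon adds 0.5 seconds
MAX_PAUSE_SECONDS = 3.0    # stated maximum for consecutive icons
RECOMMENDED_MAX_ICONS = 2  # assumption: "two consecutive pauses" = two icons


def icons_for_pause(seconds):
    """Number of icons needed for a pause, rounded up and capped at the maximum."""
    if seconds <= 0:
        return 0
    capped = min(seconds, MAX_PAUSE_SECONDS)
    return math.ceil(capped / PAUSE_STEP_SECONDS)


def exceeds_recommendation(seconds):
    """True if the pause needs more icons than the recommended two in a row."""
    return icons_for_pause(seconds) > RECOMMENDED_MAX_ICONS


if __name__ == "__main__":
    for wanted in (0.5, 1.0, 1.2, 5.0):
        icons = icons_for_pause(wanted)
        note = " (above the recommended two)" if exceeds_recommendation(wanted) else ""
        print(f"{wanted} s -> {icons} icon(s){note}")
```

When a pause would need more than two icons, splitting the text into separate segments or rephrasing the sentence may be a safer way to get the same effect.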

Use Cases & Real-World Benefits

  • Marketing & Advertising
    Marketers often use short, impactful lines followed by a well-timed pause to spark curiosity. Now you can perfect your brand messaging and preview different deliveries without wasting credits.
  • E-Learning & Instructional Videos
    Educational content often includes complex terms or acronyms. Quickly preview how they’re spoken, add the right pauses, and ensure learners can easily follow along.
  • Storytelling & Narration
    Dramatic voiceovers rely on precise pacing. A well-placed pause can create suspense or emotional impact—something auto-generated TTS pacing may not always achieve.
  • Professional Presentations
    When presenting—such as in financial reviews or business pitches—mispronounced names or numbers can affect credibility. Previewing and adding pauses helps ensure a smooth, professional narration.