Audio & Voice

Which languages are supported on VisionStory?
VisionStory supports over 30 major languages from around the world, including English, Chinese, Spanish, Arabic, Portuguese, Russian, Japanese, Punjabi, German, French, Korean, Turkish, Tamil, Vietnamese, Hindi, Bengali, Urdu, Persian, Italian, Indonesian, Thai, Marathi, Telugu, Ukrainian, Malay, Romanian, Polish, Dutch, Gujarati, and Kannada, among others.
How many voices are available in VisionStory's voice library, and can I customise them?
VisionStory provides a library of over 200 voices, which you can filter by gender, age, and use case. If you do not find a suitable voice, you can also create your own custom AI voice clone by uploading or recording audio.
Why are there fewer voice options available in my language?
Some languages have fewer voice options because those voices are specially optimised for that language. However, many English voices can also speak multiple languages, so you still have flexibility in choosing a voice that suits your needs.
What is voice cloning, and how can I clone a voice?
Voice cloning lets you create a personalised AI voice that imitates a particular voice. To clone a voice, upload or record an audio sample. For the best results, make sure the audio is recorded clearly in a quiet setting.
Is voice cloning free to use?
Creating a voice clone is free. However, to use a cloned voice in video generation, you need a Pro plan or a higher subscription.
How many languages does voice cloning support?
Voice cloning is available free of charge in over 32 languages. The list of supported languages may change, so please check the voice cloning feature for the latest options. Please note: while cloning is free, you will need a Pro plan or a higher subscription to use the cloned voice in video creation.
What is preview audio, and what are its benefits?
Preview audio lets you generate and listen to the speech for your talking video before creating the final video. This helps you check the voice, pronunciation, and pauses, and adjust the voice as needed before you spend credits on generating the video. Preview audio is free for all subscribers up to a daily quota that resets each day. If you use up your daily quota, you can buy more previews using credits.
What do the stopwatch icon and +0.5s indicate?
The stopwatch icon and +0.5s feature let you add a 0.5-second pause in the generated voice. You can use multiple stopwatch icons one after another to create longer pauses as required in your video.
What is URL import, and which URLs are supported?
URL import extracts the audio from a link you provide so that you can use it in video creation. At present, it supports links from YouTube and TikTok. If you would like support for more websites, please let us know. You can also use the voice changer feature to modify the imported audio while retaining the original content.
What is the remove noise feature?
The remove noise feature removes background noise from audio that you import or record, so your videos have clearer sound. To use this feature, you need a Pro plan or a higher subscription.
What is the voice changer feature?
The voice changer feature lets you alter the voice in an audio clip, so you can create unique versions of the speech while keeping the original content intact. To use this feature, you need a Pro plan or a higher subscription.
Can I control the emotion of the voice?
The emotion in the voice is determined by the text you enter. When you use different words or phrases, the text-to-speech (TTS) system automatically adds the suitable emotion, so you do not need to do anything extra to control it.
What should I remember while using the stopwatch (pause) feature?
When you use the stopwatch feature, each stopwatch adds a 0.5-second pause. You can use them one after another to create longer pauses, up to a maximum of 3 seconds. However, it is best not to use more than two pauses in a row within a single text segment, as this might cause the AI to generate unexpected sounds or glitches.
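The pause arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers stated in this FAQ, not VisionStory's actual implementation: the function names and constants are hypothetical, and clamping at 3 seconds is an assumption about how the maximum is applied.

```python
# Illustrative only: these names are hypothetical, not part of any VisionStory API.
PAUSE_STEP_SECONDS = 0.5       # each stopwatch adds a 0.5-second pause
MAX_PAUSE_SECONDS = 3.0        # stated maximum for consecutive pauses
RECOMMENDED_MAX_IN_A_ROW = 2   # more than this may cause unexpected sounds


def pause_duration(stopwatch_count: int) -> float:
    """Return the total pause, in seconds, for consecutive stopwatches.

    Assumption: the total is clamped at the 3-second maximum.
    """
    if stopwatch_count < 0:
        raise ValueError("stopwatch_count must not be negative")
    return min(stopwatch_count * PAUSE_STEP_SECONDS, MAX_PAUSE_SECONDS)


def exceeds_recommendation(stopwatch_count: int) -> bool:
    """Return True if the run is longer than the recommended two in a row."""
    return stopwatch_count > RECOMMENDED_MAX_IN_A_ROW


if __name__ == "__main__":
    for count in (1, 2, 6):
        print(count, pause_duration(count), exceeds_recommendation(count))
```

For example, two stopwatches in a row give a 1-second pause and stay within the recommendation, while six reach the 3-second maximum but exceed it.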