VisionStory supports over 30 major languages from around the world, including English, Chinese, Spanish, Arabic, Portuguese, Russian, Japanese, Punjabi, German, French, Korean, Turkish, Tamil, Vietnamese, Hindi, Bengali, Urdu, Persian, Italian, Indonesian, Thai, Marathi, Telugu, Ukrainian, Malay, Romanian, Polish, Dutch, Gujarati, and Kannada, among others.
How many voices are available in VisionStory's voice library, and can I customise them?
VisionStory provides a library of over 200 voices, which you can filter by gender, age, and use case. If you do not find a suitable voice, you can also create your own custom AI voice clone by uploading or recording audio.
Why are there fewer voice options available in my language?
Some languages have fewer voices because the voices listed for them are specially optimised for that language. However, many English voices can also speak other languages, so you still have flexibility in choosing a voice that suits your needs.
What is voice cloning, and how can I clone a voice?
Voice cloning lets you create a personalised AI voice that imitates a particular voice. To clone a voice, upload or record an audio sample. For the best results, record the audio clearly in a quiet setting.
Is voice cloning free?
Creating a voice clone is free for English, Spanish, Japanese, and Chinese, so you can check whether the cloned voice matches your own. However, to use the cloned voice in video creation, you need the Pro Plan or higher. Voice cloning in any other language also requires the Pro Plan or higher.
How many languages does voice cloning support?
Voice cloning is available free of charge in four languages: English, Spanish, Japanese, and Chinese. For other languages, you will need the Pro Plan or higher. The list of supported languages may change, so please check the voice cloning feature for the latest options.
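For readers who find rules easier to follow as code, the voice cloning conditions described above can be sketched roughly as follows. This is only an illustration of the rules in this FAQ, not VisionStory's actual implementation; the function names and the `has_pro_or_higher` flag are hypothetical.

```python
# Languages this FAQ lists as free for voice cloning (the list may change).
FREE_CLONE_LANGUAGES = {"english", "spanish", "japanese", "chinese"}


def can_create_clone(language: str, has_pro_or_higher: bool) -> bool:
    """Cloning is free in four languages; any other language needs Pro or above."""
    # `has_pro_or_higher` is a hypothetical flag standing in for the user's plan.
    return has_pro_or_higher or language.strip().lower() in FREE_CLONE_LANGUAGES


def can_use_clone_in_video(has_pro_or_higher: bool) -> bool:
    """Using a cloned voice in video creation always needs Pro or above."""
    return has_pro_or_higher
```

For example, a user without a Pro Plan can create an English clone to hear how it sounds, but cannot use it in a video or create a German clone until they upgrade.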
What is preview audio, and what are its benefits?
Preview audio lets you generate and listen to the speech for your talking video before creating the final video. This helps you check the voice, pronunciation, and pauses to make sure they are as you want. You can adjust the voice as needed before using credits to generate the video. To use preview audio, you need to subscribe to the Pro Plan or above, and each plan offers a different preview quota.
What do the stopwatch icon and +0.5s indicate?
The stopwatch icon and +0.5s feature let you add a 0.5-second pause in the generated voice. You can use multiple stopwatch icons one after another to create longer pauses as required in your video.
What is URL import, and which URLs are supported?
URL import downloads and extracts the audio from a link you provide so that you can use it in video creation. At present, it supports links from YouTube and TikTok. If you would like support for more websites, please let us know. You can also use the voice changer feature to modify the imported audio while retaining the original content.
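If you want to check a link before pasting it in, a rough test might look like the sketch below. The host names are an assumption based on the common public domains of YouTube and TikTok; this FAQ does not list the exact link formats that URL import accepts, and `is_supported_import_url` is a hypothetical helper, not part of VisionStory.

```python
from urllib.parse import urlparse

# Assumed domains for the two supported sites; not an official list.
SUPPORTED_HOSTS = {"youtube.com", "youtu.be", "tiktok.com"}


def is_supported_import_url(url: str) -> bool:
    """Return True if the link points at one of the assumed supported hosts."""
    host = (urlparse(url).hostname or "").lower()
    # Accept the bare domain or any subdomain such as www. or m.
    return any(host == h or host.endswith("." + h) for h in SUPPORTED_HOSTS)
```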
What is the remove noise feature?
The remove noise feature helps you get rid of background noise from your audio when you import or record it, so your videos have clearer sound quality. To use this feature, you need to be on the Pro Plan or a higher subscription.
What is the voice changer feature?
The voice changer feature lets you alter the voice in an audio clip, so you can create unique versions of the speech while keeping the original content intact. To use this feature, you need to be on the Pro Plan or above.
Can I control the emotion of the voice?
The emotion in the voice is determined by the text you enter. When you use different words or phrases, the text-to-speech (TTS) system automatically adds the suitable emotion, so you do not need to do anything extra to control it.
What should I remember while using the stopwatch (pause) feature?
When you use the stopwatch feature, each stopwatch adds a 0.5-second pause. You can use them one after another to create longer pauses, up to a maximum of 3 seconds. However, it is best not to use more than two pauses in a row within a single text segment, as this might cause the AI to generate unexpected sounds or glitches.
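The pause arithmetic above can be worked through with a small sketch. It only restates the numbers in this FAQ (0.5 seconds per stopwatch, a 3-second maximum, and the advice to keep to two in a row); the function names are hypothetical and nothing here reflects how VisionStory handles pauses internally.

```python
PAUSE_SECONDS = 0.5           # each stopwatch adds this much silence
MAX_PAUSE_SECONDS = 3.0       # longest pause the FAQ says you can build
RECOMMENDED_MAX_IN_A_ROW = 2  # more than this may cause glitches


def stopwatches_for_pause(seconds: float) -> int:
    """Return how many consecutive stopwatches produce a pause of `seconds`."""
    if seconds <= 0 or seconds > MAX_PAUSE_SECONDS:
        raise ValueError("pause must be between 0.5 and 3 seconds")
    count = seconds / PAUSE_SECONDS
    if count != int(count):
        raise ValueError("pause must be a multiple of 0.5 seconds")
    return int(count)


def within_recommendation(count: int) -> bool:
    """True if the run of stopwatches stays within the recommended two in a row."""
    return count <= RECOMMENDED_MAX_IN_A_ROW
```

For instance, a 1.5-second pause needs three stopwatches, which is allowed but goes beyond the recommended two in a row, so it is worth listening to the preview audio before generating the video.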