In our last article we made a number of promises about our speech synthesis.

After a lot of hard work, we have finally delivered on these promises:

- A high-quality voice has been added (along with unlimited "random" voices).
- All speakers are now squeezed into the same model.
- Input length limitations have been lifted, so the models can now work with whole paragraphs of text.
- Pauses, speed and pitch can be controlled via SSML.
- Sampling rates of 8, 24 or 48 kHz are supported.
- The models are much more stable - they no longer omit words.

This is a true breakthrough for us, and we are not planning to stop anytime soon. We will be adding as many languages as possible shortly (the CIS languages, English, European languages, Indic languages). We are also still planning to make our models an additional 2-5x faster. Beyond that, we plan to add phonemes and a new model for stress, and to reduce the minimum amount of audio required to train a high-quality voice to 5 - 15 minutes.

As usual, you can try our model in our repo or in colab.
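To make the SSML and sampling-rate points more concrete, here is a minimal sketch of how an input with pauses, speed and pitch might be prepared. It uses only standard W3C SSML elements (`<speak>`, `<break>`, `<prosody>`) and the Python standard library. The helper names `build_ssml` and `check_sample_rate` are hypothetical, and which subset of SSML the models actually accept is an assumption that should be checked against the repo.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Sampling rates named in the post: 8, 24 or 48 kHz.
SUPPORTED_SAMPLE_RATES = (8000, 24000, 48000)

# Standard W3C SSML prosody labels (assumed, not confirmed, to be accepted by the models).
RATE_LABELS = ("x-slow", "slow", "medium", "fast", "x-fast")
PITCH_LABELS = ("x-low", "low", "medium", "high", "x-high")


def build_ssml(sentences, pause_ms=500, rate="medium", pitch="medium"):
    """Join sentences with fixed pauses and wrap them in one prosody element."""
    if rate not in RATE_LABELS:
        raise ValueError(f"unknown rate label: {rate!r}")
    if pitch not in PITCH_LABELS:
        raise ValueError(f"unknown pitch label: {pitch!r}")
    pause = f'<break time="{int(pause_ms)}ms"/>'
    # Escape &, < and > so that plain text cannot break the markup.
    body = pause.join(escape(sentence) for sentence in sentences)
    ssml = f'<speak><prosody rate="{rate}" pitch="{pitch}">{body}</prosody></speak>'
    ET.fromstring(ssml)  # raises ParseError if the result is not well-formed XML
    return ssml


def check_sample_rate(sample_rate):
    """Return the rate unchanged if it is one of the three supported values."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"sample rate must be one of {SUPPORTED_SAMPLE_RATES}")
    return sample_rate


if __name__ == "__main__":
    print(build_ssml(["First sentence.", "Second sentence."], pause_ms=300, rate="slow"))
    print(check_sample_rate(48000))
```

The resulting string and sample rate would then be passed to whatever synthesis call the repo documents. Validating both up front keeps malformed markup or an unsupported rate from reaching the model.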