Select an ASR/TTS provider
You can select ASR and TTS providers when you create a phone channel. Open the ASR tab and select a connection, then repeat these steps for TTS.
caution
If you select a specific ASR/TTS provider and an incident occurs on its side,
you will need to switch your channel to another provider manually.
You can also keep the Default settings, in which case the configuration of the most stable ASR and TTS providers is applied. If an incident occurs on the side of the provider currently in use, the channel is automatically switched to another provider.
ASR configuration
You can select one of the connections for ASR and specify additional settings when you create a phone channel.
Connection | Settings | Description |
---|---|---|
Google | Language | The service can recognize speech in multiple languages. You can find the complete list in the Google documentation. |
Google | Model | The machine learning model used for speech recognition. Google trained these models for specific audio types and sources. See the table for the list of models available for each language. • Command and search: use this model to recognize speech in short audio files, such as voice commands. • Default: use this model in all other cases. • Phone call: use this model to recognize speech in phone calls. This model is available only if you use your own ASR connection. |
Azure | Language | The service can recognize speech in multiple languages. You can find the complete list in the Microsoft documentation. |
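The guidance in the Model row for Google can be read as a simple decision rule. The sketch below is only an illustration of that rule: the function name `pick_asr_model` and the `audio_kind` labels are hypothetical and are not part of the platform or of any Google API. The returned strings are the model names as they appear in the table.

```python
def pick_asr_model(audio_kind: str, own_connection: bool) -> str:
    """Return the model name suggested by the table for a kind of audio.

    audio_kind is a hypothetical label used only in this sketch:
    "voice_command", "phone_call", or anything else.
    own_connection tells whether you use your own ASR connection.
    """
    if audio_kind == "voice_command":
        # Short audio such as voice commands.
        return "Command and search"
    if audio_kind == "phone_call" and own_connection:
        # The Phone call model requires your own ASR connection.
        return "Phone call"
    # All other cases, including phone calls without your own connection.
    return "Default"


if __name__ == "__main__":
    print(pick_asr_model("voice_command", own_connection=False))
    print(pick_asr_model("phone_call", own_connection=True))
    print(pick_asr_model("phone_call", own_connection=False))
```

Falling back to Default when the Phone call model is unavailable is an assumption of this sketch, based on the "all other cases" wording in the table.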
TTS configuration
You can select one of the connections for TTS and specify additional settings when you create a phone channel.
Connection | Settings | Description |
---|---|---|
Google | Language | The service can synthesize speech in multiple languages. You can find the complete list in the Google documentation. |
Google | Voice | The service offers multiple voices (see the Google documentation for the complete list). The following voices are used by default: • en-US-Wavenet-A for English; • cmn-CN-Wavenet-B for Chinese; • Wavenet-A for other languages. |
Google | Speed | Speech tempo. A value of 1 is the normal speed for the selected voice. |
Google | Voice pitch | Pitch of the voice in semitones. A value of 20 raises the pitch 20 semitones above the original, and -20 lowers it by the same amount. |
Google | Raise volume | Volume gain in dB relative to the normal volume of the selected voice. At +6.0 dB, playback is approximately twice the normal amplitude. We strongly recommend not exceeding +10.0 dB. |
Azure | Voice | The service offers multiple voices (see the Microsoft documentation for the complete list). Tovie Platform supports neural voices only. The names of these voices contain the word “neural”. |
tip
Custom voices that you created and trained yourself do not appear in the dropdown list of available voices.
To use them, enter the voice name manually.
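To get a feel for the Voice pitch and Raise volume values in the TTS table, it can help to convert them to plain ratios. The sketch below uses the standard decibel-to-amplitude formula and the equal-temperament semitone formula. The function names are hypothetical, and the sketch assumes the provider interprets these settings in the conventional way.

```python
def gain_db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a volume gain in dB to an amplitude ratio (1.0 = unchanged)."""
    return 10 ** (gain_db / 20)


def semitones_to_frequency_ratio(semitones: float) -> float:
    """Convert a pitch shift in semitones to a frequency ratio (1.0 = unchanged)."""
    return 2 ** (semitones / 12)


if __name__ == "__main__":
    # +6.0 dB is roughly a doubling of amplitude.
    print(round(gain_db_to_amplitude_ratio(6.0), 3))
    # +10.0 dB, the recommended upper limit, is a bit more than a tripling.
    print(round(gain_db_to_amplitude_ratio(10.0), 3))
    # 12 semitones is one octave, which doubles the frequency.
    print(semitones_to_frequency_ratio(12))
    # The extreme pitch values from the table.
    print(round(semitones_to_frequency_ratio(20), 3))
    print(round(semitones_to_frequency_ratio(-20), 3))
```

Because decibels are logarithmic, small steps near the top of the range change the output much more than the same steps near zero, which is one reason to stay below +10.0 dB.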