Yandex SpeechKit Text-to-Speech API v3: Synthesizer (gRPC)
A set of methods for speech synthesis.
| Call | Description |
|---|---|
| UtteranceSynthesis | Synthesizes text into speech. |
UtteranceSynthesis
Synthesizes text into speech.
rpc UtteranceSynthesis (UtteranceSynthesisRequest) returns (stream UtteranceSynthesisResponse)
UtteranceSynthesisRequest
| Field | Description |
|---|---|
| model | string The name of the model. Specifies basic synthesis functionality. Currently must be empty; do not use it. |
| Utterance | oneof: text or text_template. Text to synthesize, in one of the supported text markups. |
| text | string Raw text (e.g. "Hello, Alice"). |
| text_template | TextTemplate Text template instance, e.g. {"Hello, {username}" with username="Alice"}. |
| hints[] | Hints Optional. Speech synthesis settings. |
| output_audio_spec | AudioFormatOptions Optional. Default: 22050 Hz, linear 16-bit signed little-endian PCM, with a WAV header. |
| loudness_normalization_type | enum LoudnessNormalizationType Optional. Specifies the type of loudness normalization. Default: LUFS. |
| unsafe_mode | bool Optional. Automatically splits long text into several utterances and bills accordingly. Some degradation in service quality is possible. |
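Because Utterance is a oneof, a request carries either text or text_template, not both. The sketch below illustrates that constraint with plain Python dictionaries. The helper name `build_request` and the dict layout are illustrative assumptions, not the generated gRPC classes, and the check assumes that a request with no utterance at all is not useful.

```python
from typing import Any, Dict, List, Optional


def build_request(
    text: Optional[str] = None,
    text_template: Optional[Dict[str, Any]] = None,
    hints: Optional[List[Dict[str, Any]]] = None,
    unsafe_mode: bool = False,
) -> Dict[str, Any]:
    """Build a dict shaped like UtteranceSynthesisRequest (hypothetical helper)."""
    # The Utterance oneof allows only one of text / text_template.
    if (text is None) == (text_template is None):
        raise ValueError("set exactly one of text or text_template")
    request: Dict[str, Any] = {
        "hints": list(hints or []),
        "unsafe_mode": unsafe_mode,
    }
    if text is not None:
        request["text"] = text
    else:
        request["text_template"] = text_template
    # model is omitted on purpose: the table says it must currently be empty.
    return request


# "example_voice" is a placeholder, not a real voice name.
request = build_request(text="Hello, Alice", hints=[{"voice": "example_voice"}])
```

With the real generated stubs, the same fields would be set on an UtteranceSynthesisRequest message instead of a dictionary.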
TextTemplate
| Field | Description |
|---|---|
| text_template | string Template text. Sample: The {animal} goes to the {place}. |
| variables[] | TextVariable Defining variables in template text. Sample: {animal: cat, place: forest} |
TextVariable
| Field | Description |
|---|---|
| variable_name | string The name of the variable. |
| variable_value | string The text of the variable. |
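To show how text_template and variables[] relate, the sketch below substitutes each {variable_name} placeholder with its variable_value. It is only a local illustration of the placeholder convention used in the samples above, not the service's implementation; the function name `render_template` is hypothetical.

```python
from typing import Dict, List


def render_template(text_template: str, variables: List[Dict[str, str]]) -> str:
    """Replace each {variable_name} placeholder with its variable_value."""
    result = text_template
    for var in variables:
        placeholder = "{" + var["variable_name"] + "}"
        result = result.replace(placeholder, var["variable_value"])
    return result


template = "The {animal} goes to the {place}."
variables = [
    {"variable_name": "animal", "variable_value": "cat"},
    {"variable_name": "place", "variable_value": "forest"},
]
print(render_template(template, variables))  # The cat goes to the forest.
```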
Hints
| Field | Description |
|---|---|
| Hint | oneof: voice, audio_template, speed, volume, role, pitch_shift or duration. The hint for the TTS engine to specify synthesized audio characteristics. |
| voice | string Name of speaker to use. |
| audio_template | AudioTemplate Template for synthesizing. |
| speed | double Hint to change speed. |
| volume | double Hint to regulate normalization level. |
| role | string Hint to specify pronunciation character for the speaker. |
| pitch_shift | double Hint to increase (or decrease) speaker’s pitch, measured in Hz. Valid values are in range [-1000;1000], default value is 0. |
| duration | DurationHint Hint to limit both minimum and maximum audio duration. |
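Since Hint is a oneof, each element of hints[] carries a single setting, so several settings mean several Hints entries. The sketch below checks that shape on plain dictionaries and enforces the pitch_shift range stated in the table. The function name `validate_hints`, the dict layout, and the voice name are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Field names taken from the Hint oneof in the table above.
HINT_FIELDS = {
    "voice", "audio_template", "speed", "volume",
    "role", "pitch_shift", "duration",
}


def validate_hints(hints: List[Dict[str, Any]]) -> None:
    """Raise ValueError if any entry does not look like a single Hint."""
    for hint in hints:
        if len(hint) != 1:
            raise ValueError("each Hints entry must set exactly one field")
        name, value = next(iter(hint.items()))
        if name not in HINT_FIELDS:
            raise ValueError(f"unknown hint: {name}")
        # Range documented for pitch_shift: [-1000; 1000] Hz.
        if name == "pitch_shift" and not -1000 <= value <= 1000:
            raise ValueError("pitch_shift must be within [-1000, 1000] Hz")


# "example_voice" is a placeholder, not a real voice name.
validate_hints([{"voice": "example_voice"}, {"speed": 1.2}, {"pitch_shift": -50.0}])
```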
AudioTemplate
| Field | Description |
|---|---|
| audio | AudioContent Audio file. |
| text_template | TextTemplate Template and description of its variables. |
| variables[] | AudioVariable Describing variables in audio. |
AudioContent
| Field | Description |
|---|---|
| AudioSource | oneof: content. The audio source to read the data from. |
| content | bytes Bytes with audio data. |
| audio_spec | AudioFormatOptions Description of the audio format. |
AudioVariable
| Field | Description |
|---|---|
| variable_name | string The name of the variable. |
| variable_start_ms | int64 Start time of the variable in milliseconds. |
| variable_length_ms | int64 Length of the variable in milliseconds. |
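Each AudioVariable marks a time span inside the template audio by a start and a length in milliseconds. As a small worked example, the sketch below derives the end of that span and checks that it lies within the audio. The helper names and the requirement that a span must fit inside the audio are assumptions, not documented service behavior.

```python
from typing import Any, Dict, Tuple


def variable_span_ms(variable: Dict[str, Any]) -> Tuple[int, int]:
    """Return (start_ms, end_ms) for an AudioVariable-shaped dict."""
    start = variable["variable_start_ms"]
    end = start + variable["variable_length_ms"]
    return start, end


def fits_in_audio(variable: Dict[str, Any], audio_length_ms: int) -> bool:
    """Check that the variable's span lies inside the template audio."""
    start, end = variable_span_ms(variable)
    return 0 <= start <= end <= audio_length_ms


username = {
    "variable_name": "username",
    "variable_start_ms": 1200,
    "variable_length_ms": 450,
}
print(variable_span_ms(username))  # (1200, 1650)
```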
DurationHint
| Field | Description |
|---|---|
| policy | enum DurationHintPolicy Type of duration constraint. |
| duration_ms | int64 Constraint on audio duration in milliseconds. |
AudioFormatOptions
| Field | Description |
|---|---|
| AudioFormat | oneof: raw_audio or container_audio |
| raw_audio | RawAudio The audio format specified in request parameters. |
| container_audio | ContainerAudio The audio format specified inside the container metadata. |
RawAudio
| Field | Description |
|---|---|
| audio_encoding | enum AudioEncoding Encoding type. |
| sample_rate_hertz | int64 Sampling frequency of the signal. |
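For raw audio, sample_rate_hertz together with the sample width determines how many bytes a given duration occupies. The sketch below works through that arithmetic for 16-bit linear PCM at the default 22050 Hz. The single-channel default and the function name `pcm_byte_count` are assumptions for illustration.

```python
def pcm_byte_count(
    duration_ms: int,
    sample_rate_hertz: int = 22050,
    bytes_per_sample: int = 2,  # 16-bit linear PCM
    channels: int = 1,          # assumption: mono output
) -> int:
    """Number of raw PCM bytes needed for duration_ms of audio."""
    return duration_ms * sample_rate_hertz * bytes_per_sample * channels // 1000


print(pcm_byte_count(1000))  # 44100
```

The same formula, inverted, gives the duration of a received raw buffer from its length in bytes.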
ContainerAudio
| Field | Description |
|---|---|
| container_audio_type | enum ContainerAudioType |
UtteranceSynthesisResponse
| Field | Description |
|---|---|
| audio_chunk | AudioChunk Part of synthesized audio. |
| text_chunk | TextChunk Part of synthesized text. |
| start_ms | int64 Start time of the audio chunk in milliseconds. |
| length_ms | int64 Length of the audio chunk in milliseconds. |
AudioChunk
| Field | Description |
|---|---|
| data | bytes Sequence of bytes of the synthesized audio in format specified in output_audio_spec. |
TextChunk
| Field | Description |
|---|---|
| text | string Synthesized text. |
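UtteranceSynthesis returns a stream, so the client receives the audio as a sequence of UtteranceSynthesisResponse messages and has to assemble it. A minimal sketch of that step follows. The stand-in objects only mimic the attribute names from the tables above; in a real client the iterable would be the response stream returned by the generated gRPC stub.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_audio(responses: Iterable[Any]) -> bytes:
    """Concatenate audio_chunk.data from each response, in arrival order."""
    return b"".join(response.audio_chunk.data for response in responses)


# Stand-ins with the same attribute names as UtteranceSynthesisResponse.
fake_stream = [
    SimpleNamespace(audio_chunk=SimpleNamespace(data=b"RIFF"), start_ms=0, length_ms=10),
    SimpleNamespace(audio_chunk=SimpleNamespace(data=b"\x00\x01"), start_ms=10, length_ms=10),
]
audio = collect_audio(fake_stream)
```

The resulting bytes are in the format requested through output_audio_spec and can be written to a file as they are.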