Generation-time parameters shared by advanced TTS APIs.
#include <c-api.h>
This struct supports both simple multi-speaker synthesis and more advanced zero-shot or reference-conditioned models.
Example for Pocket TTS:
SherpaOnnxGenerationConfig cfg;
memset(&cfg, 0, sizeof(cfg));
cfg.extra = "{\"max_reference_audio_len\": 10.0, \"seed\": 42}";
Definition at line 2679 of file c-api.h.
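The JSON in the Pocket TTS example above is a hand-escaped string literal, which is easy to get wrong once the values come from variables. The following is a minimal sketch of one way to build the same string at run time. The helper name format_pocket_extra is hypothetical and is not part of c-api.h; only the two keys shown in the example are assumed to exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of c-api.h): writes the JSON used in the
 * Pocket TTS example into buf. Returns 0 on success. Returns -1 and clears
 * buf if the formatted string does not fit. */
int format_pocket_extra(char *buf, size_t buf_size,
                        double max_reference_audio_len, int seed) {
  assert(buf != NULL);
  int n = snprintf(buf, buf_size,
                   "{\"max_reference_audio_len\": %.1f, \"seed\": %d}",
                   max_reference_audio_len, seed);
  if (n < 0 || (size_t)n >= buf_size) {
    memset(buf, 0, buf_size); /* do not leave a truncated JSON string */
    return -1;
  }
  return 0;
}
```

On success the buffer can be assigned to cfg.extra. Since extra is a plain const char*, the buffer should stay valid for as long as the config is in use.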
◆ extra
| const char* SherpaOnnxGenerationConfig::extra |
Optional model-specific JSON string with extra key/value pairs.
Definition at line 2697 of file c-api.h.
◆ num_steps
| int32_t SherpaOnnxGenerationConfig::num_steps |
Optional number of flow-matching steps.
Definition at line 2695 of file c-api.h.
◆ reference_audio
| const float* SherpaOnnxGenerationConfig::reference_audio |
Optional reference audio for zero-shot or voice-cloning models.
Definition at line 2687 of file c-api.h.
◆ reference_audio_len
| int32_t SherpaOnnxGenerationConfig::reference_audio_len |
Length of reference_audio in samples.
Definition at line 2689 of file c-api.h.
◆ reference_sample_rate
| int32_t SherpaOnnxGenerationConfig::reference_sample_rate |
Sample rate of reference_audio.
Definition at line 2691 of file c-api.h.
◆ reference_text
| const char* SherpaOnnxGenerationConfig::reference_text |
Optional reference text associated with reference_audio.
Definition at line 2693 of file c-api.h.
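The four reference_* members describe a single reference clip, so they are normally set together. The sketch below shows one possible way to do that. The struct is a local stand-in that mirrors the members documented on this page so the example compiles on its own (real code would include c-api.h instead), and set_reference is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in mirroring the members documented on this page; real code would
 * include c-api.h instead of declaring this. */
typedef struct SherpaOnnxGenerationConfig {
  float silence_scale;
  float speed;
  int32_t sid;
  const float *reference_audio;
  int32_t reference_audio_len;
  int32_t reference_sample_rate;
  const char *reference_text;
  int32_t num_steps;
  const char *extra;
} SherpaOnnxGenerationConfig;

/* Hypothetical helper: zero the config, then attach one reference clip.
 * The config only stores the pointers; it does not copy samples or text. */
void set_reference(SherpaOnnxGenerationConfig *cfg, const float *samples,
                   int32_t num_samples, int32_t sample_rate,
                   const char *text) {
  assert(cfg != NULL);
  memset(cfg, 0, sizeof(*cfg));
  cfg->reference_audio = samples;
  cfg->reference_audio_len = num_samples; /* length in samples, not bytes */
  cfg->reference_sample_rate = sample_rate;
  cfg->reference_text = text;
}
```

Because only pointers are stored, the sample buffer and the text should outlive any use of the config.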
◆ sid
| int32_t SherpaOnnxGenerationConfig::sid |
Speaker ID for multi-speaker models.
Definition at line 2685 of file c-api.h.
◆ silence_scale
| float SherpaOnnxGenerationConfig::silence_scale |
Silence scale between sentences.
Definition at line 2681 of file c-api.h.
◆ speed
| float SherpaOnnxGenerationConfig::speed |
Speech rate. Used only by models that support it.
Definition at line 2683 of file c-api.h.
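For the simple multi-speaker case, only sid, speed, and silence_scale need to be filled in, and the reference members can stay zeroed. The following is a sketch under that assumption. The struct is again a local stand-in that mirrors the members documented on this page, and init_multi_speaker is a hypothetical helper; suitable values for speed and silence_scale depend on the model and are not specified here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in mirroring the members documented on this page; real code would
 * include c-api.h instead of declaring this. */
typedef struct SherpaOnnxGenerationConfig {
  float silence_scale;
  float speed;
  int32_t sid;
  const float *reference_audio;
  int32_t reference_audio_len;
  int32_t reference_sample_rate;
  const char *reference_text;
  int32_t num_steps;
  const char *extra;
} SherpaOnnxGenerationConfig;

/* Hypothetical helper: zero every member, then set only the three values
 * relevant to plain multi-speaker synthesis. */
void init_multi_speaker(SherpaOnnxGenerationConfig *cfg, int32_t sid,
                        float speed, float silence_scale) {
  assert(cfg != NULL);
  memset(cfg, 0, sizeof(*cfg)); /* reference_* and extra stay unset */
  cfg->sid = sid;
  cfg->speed = speed;
  cfg->silence_scale = silence_scale;
}
```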
The documentation for this struct was generated from the following file:
c-api.h