sherpa-onnx supports two VAD model families, Silero VAD and Ten VAD, through the SherpaOnnxCreateVoiceActivityDetector() API. Configure exactly one by filling in the corresponding field of SherpaOnnxVadModelConfig: silero_vad or ten_vad.
See also: SherpaOnnxCreateVoiceActivityDetector, SherpaOnnxVadModelConfig
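The "exactly one" rule can be checked before the detector is created. The following is a minimal sketch and not part of sherpa-onnx: the Demo* types and functions are hypothetical stand-ins that carry only a model path, and the sketch assumes a family counts as configured when its model path is a non-empty string.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; real code would use the
 * configuration types declared in the sherpa-onnx C API header. */
typedef struct DemoFamilyConfig {
  const char *model; /* assumed field: path to the .onnx model file */
} DemoFamilyConfig;

typedef struct DemoVadFamilies {
  DemoFamilyConfig silero_vad;
  DemoFamilyConfig ten_vad;
} DemoVadFamilies;

/* Returns 1 if the path is set and non-empty, 0 otherwise. */
static int demo_is_set(const char *path) {
  return path != NULL && path[0] != '\0';
}

/* Returns 1 when exactly one model family is configured, 0 otherwise. */
static int demo_exactly_one_family(const DemoVadFamilies *c) {
  return demo_is_set(c->silero_vad.model) + demo_is_set(c->ten_vad.model) == 1;
}
```

A caller could run such a check on its own configuration and refuse to create the detector when it returns 0.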
Silero VAD
Silero VAD is a widely used voice activity detection model. The recommended window size is 512 samples.
SherpaOnnxVadModelConfig config;
memset(&config, 0, sizeof(config));
API elements referenced in this snippet:
- const SherpaOnnxVoiceActivityDetector *SherpaOnnxCreateVoiceActivityDetector(const SherpaOnnxVadModelConfig *config, float buffer_size_in_seconds): Create a voice activity detector.
- SherpaOnnxVoiceActivityDetector: Opaque voice activity detector handle.
- SherpaOnnxVadModelConfig: Configuration shared by voice activity detectors.
- SherpaOnnxSileroVadModelConfig silero_vad: Silero VAD settings, including float min_silence_duration, float max_speech_duration, and float min_speech_duration.
Model file: silero_vad.onnx
Example source: vad-whisper-c-api.c
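The zero-then-fill pattern for the Silero VAD settings might look like the sketch below. It is illustrative only: DemoSileroVad is a hypothetical mirror of the fields named in this section plus an assumed model path and window_size field, the duration values are arbitrary placeholders (assumed to be in seconds) rather than recommended settings, and the real struct layout should be taken from the sherpa-onnx C API header.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the Silero VAD settings, for illustration only. */
typedef struct DemoSileroVad {
  const char *model;           /* assumed field: path to silero_vad.onnx */
  float min_silence_duration;  /* assumed unit: seconds */
  float min_speech_duration;   /* assumed unit: seconds */
  float max_speech_duration;   /* assumed unit: seconds */
  int32_t window_size;         /* assumed field: samples per window */
} DemoSileroVad;

/* Zero the struct first so unset fields have a known value, then fill in
 * only what is needed. */
static DemoSileroVad demo_make_silero(const char *model_path) {
  DemoSileroVad cfg;
  memset(&cfg, 0, sizeof(cfg));
  cfg.model = model_path;
  cfg.window_size = 512;            /* recommended window size for Silero VAD */
  cfg.min_silence_duration = 0.5f;  /* placeholder value */
  cfg.min_speech_duration = 0.25f;  /* placeholder value */
  cfg.max_speech_duration = 10.0f;  /* placeholder value */
  return cfg;
}
```

Zeroing before assignment mirrors the memset call in the snippet above and keeps any field the caller does not set at a predictable value.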
Ten VAD
Ten VAD is an alternative VAD model. The recommended window size is 256 samples.
SherpaOnnxVadModelConfig config;
memset(&config, 0, sizeof(config));
API elements referenced in this snippet:
- SherpaOnnxTenVadModelConfig ten_vad: Ten VAD settings, including float min_silence_duration, float max_speech_duration, and float min_speech_duration.
Model file: ten-vad.onnx
Example source: vad-whisper-c-api.c
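A recommended window size of 256 samples implies that audio is handed to the detector in fixed-size chunks. The sketch below shows one way to split a sample buffer into complete windows. It is not taken from sherpa-onnx: demo_feed_windows, DemoWindowFn, and demo_count_window are hypothetical names, and the callback stands in for whatever call passes samples to the detector.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_TEN_VAD_WINDOW = 256 }; /* recommended window size for Ten VAD */

/* Hypothetical callback type: receives one complete window of samples. */
typedef void (*DemoWindowFn)(const float *window, int32_t n, void *user);

/* Calls fn once per complete window and returns the number of leftover
 * samples that did not fill a window. */
static int32_t demo_feed_windows(const float *samples, int32_t n,
                                 int32_t window, DemoWindowFn fn, void *user) {
  int32_t i = 0;
  if (n <= 0) return 0;
  if (window <= 0) return n; /* nothing can be consumed */
  for (; i + window <= n; i += window) {
    fn(samples + i, window, user);
  }
  return n - i;
}

/* Example callback: counts how many windows were delivered. */
static void demo_count_window(const float *window, int32_t n, void *user) {
  (void)window;
  (void)n;
  ++*(int32_t *)user;
}
```

The leftover count lets a caller keep the tail of one buffer and prepend it to the next, so that no samples are dropped between reads.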