sherpa-onnx C API 1.0
Public C API and C++ wrapper for sherpa-onnx
Voice Activity Detection (VAD)

sherpa-onnx supports two VAD model families, Silero VAD and Ten VAD, both created through SherpaOnnxCreateVoiceActivityDetector(). Configure exactly one of them by filling in the corresponding member of SherpaOnnxVadModelConfig: silero_vad or ten_vad.

See also
SherpaOnnxCreateVoiceActivityDetector, SherpaOnnxVadModelConfig
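
The "configure exactly one" rule can be checked in application code before a detector is created. The following is a minimal sketch, not part of the sherpa-onnx API: DemoFamilyConfig, DemoVadConfig, and demo_exactly_one_family() are hypothetical stand-ins that mirror only the model path of each family, and treating a non-empty model path as "configured" is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real sherpa-onnx structs: only the
   model path of each family is mirrored here. */
typedef struct {
  const char *model;
} DemoFamilyConfig;

typedef struct {
  DemoFamilyConfig silero_vad;
  DemoFamilyConfig ten_vad;
} DemoVadConfig;

/* A path counts as set when it is non-NULL and non-empty (assumption). */
static int is_set(const char *path) {
  return path != NULL && path[0] != '\0';
}

/* Returns 1 when exactly one family has a model path, 0 otherwise. */
int demo_exactly_one_family(const DemoVadConfig *c) {
  assert(c != NULL);
  return is_set(c->silero_vad.model) + is_set(c->ten_vad.model) == 1;
}
```

A caller could run a check like this on its own settings and refuse to continue when both paths or neither path is filled in.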

Silero VAD

Silero VAD is a widely used voice activity detection model. The recommended window size is 512 samples.

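/* Requires <string.h> for memset and c-api.h for the config type. */
SherpaOnnxVadModelConfig config;
memset(&config, 0, sizeof(config));            /* zero all fields; ten_vad stays unset */
config.silero_vad.model = "./silero_vad.onnx"; /* path to the Silero VAD model file */
config.silero_vad.threshold = 0.25f;
config.silero_vad.window_size = 512;           /* recommended window size, in samples */
config.sample_rate = 16000;                    /* Hz */
config.num_threads = 1;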

Model file: silero_vad.onnx

Example source: vad-whisper-c-api.c
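
To relate the window size to time, the arithmetic can be written out. This is a small sketch using a hypothetical helper, window_duration_ms(), which is not part of the sherpa-onnx API; it only converts a window size in samples into milliseconds at a given sample rate.

```c
#include <assert.h>

/* Hypothetical helper: duration in milliseconds of one window of
   `window_size` samples at `sample_rate` Hz. */
double window_duration_ms(int window_size, int sample_rate) {
  assert(window_size > 0 && sample_rate > 0);
  return 1000.0 * (double)window_size / (double)sample_rate;
}
```

With the values from the configuration above, window_duration_ms(512, 16000) gives 32 ms per window; a 256-sample window at the same rate covers 16 ms.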

Ten VAD

Ten VAD is an alternative VAD model. The recommended window size is 256 samples.

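/* Requires <string.h> for memset and c-api.h for the config type. */
SherpaOnnxVadModelConfig config;
memset(&config, 0, sizeof(config));        /* zero all fields; silero_vad stays unset */
config.ten_vad.model = "./ten-vad.onnx";   /* path to the Ten VAD model file */
config.ten_vad.threshold = 0.25f;
config.ten_vad.window_size = 256;          /* recommended window size, in samples */
config.sample_rate = 16000;                /* Hz */
config.num_threads = 1;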

Model file: ten-vad.onnx

Example source: vad-whisper-c-api.c
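
Audio is often processed one window at a time, so a buffer has to be cut into window_size pieces. The sketch below shows one way to do that in plain C. WindowCallback, for_each_window(), and count_window() are hypothetical names introduced for illustration and are not part of the sherpa-onnx API; in a real program the callback is where samples would be handed to the detector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives one complete window of samples. */
typedef void (*WindowCallback)(const float *window, size_t n, void *user);

/* Calls `cb` once per complete window and returns the number of
   trailing samples that did not fill a window. */
size_t for_each_window(const float *samples, size_t num_samples,
                       size_t window_size, WindowCallback cb, void *user) {
  assert(window_size > 0);
  size_t offset = 0;
  while (offset + window_size <= num_samples) {
    cb(samples + offset, window_size, user);
    offset += window_size;
  }
  return num_samples - offset;
}

/* Example callback: counts how many windows were delivered. */
void count_window(const float *window, size_t n, void *user) {
  (void)window;
  (void)n;
  ++*(size_t *)user;
}
```

For a 1000-sample buffer and a 256-sample window, the callback runs three times and 232 samples are left over; the caller decides whether to keep them for the next buffer or drop them.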