VAD API

Voice Activity Detection (VAD) API reference for sherpa-onnx-node.

Source file

API

Vad

Voice Activity Detector. Detects speech segments in an audio stream.

Constructor

const vad = new sherpa_onnx.Vad(config, bufferSizeInSeconds);

param config: VAD configuration object.
param bufferSizeInSeconds: Buffer size in seconds (number).

The config object has the following properties:

sileroVad (object, optional) — Silero VAD model config:
- model (string) — Path to silero_vad.onnx.
- threshold (number) — Speech threshold, e.g. 0.5.
- minSpeechDuration (number) — Min speech duration in seconds.
- minSilenceDuration (number) — Min silence duration in seconds.
- windowSize (number) — Window size in samples, e.g. 512.
- maxSpeechDuration (number, optional) — Max speech duration in seconds.
tenVad (object, optional) — Ten VAD model config (same fields as sileroVad).
sampleRate (number) — Sample rate in Hz, e.g. 16000.
numThreads (number, optional) — Number of threads.
debug (boolean, optional) — Enable debug output.

Methods

`vad.acceptWaveform(samples)`

Feed audio samples to the VAD.

param samples: Input audio samples (Float32Array).

`vad.isDetected()`

returns: true if speech is currently being detected (boolean).

`vad.isEmpty()`

returns: true if no completed speech segments are available (boolean).

`vad.front(enableExternalBuffer?)`

Get the earliest completed speech segment without removing it.

param enableExternalBuffer: Whether to use an external buffer (boolean, default true).
returns: A SpeechSegment object with start (number) and samples (Float32Array).

`vad.pop()`

Remove the earliest completed speech segment from the queue.

`vad.flush()`

Flush the internal buffer to emit any pending speech segments.

`vad.clear()`

Clear all internal state.

`vad.reset()`

Reset the detector to its initial state.

Properties

vad.config — The configuration object passed to the constructor.

CircularBuffer

A circular buffer that stores Float32 audio samples.

Constructor

const buffer = new sherpa_onnx.CircularBuffer(capacity);

param capacity: Buffer capacity in samples (number).

Methods

`buffer.push(samples)`

Push samples into the buffer.

param samples: Audio samples (Float32Array).

`buffer.get(startIndex, n, enableExternalBuffer?)`

Get a slice of samples from the buffer.

param startIndex: Start index (number).
param n: Number of samples to read (number).
param enableExternalBuffer: Whether to use an external buffer (boolean, default true).
returns: Audio samples (Float32Array).

`buffer.pop(n)`

Remove n samples from the front of the buffer.

param n: Number of samples to remove (number).

`buffer.size()`

returns: Current number of samples in the buffer (number).

`buffer.head()`

returns: The head index of the buffer (number).

`buffer.reset()`

Reset the buffer to empty state.

Example

const sherpa_onnx = require('sherpa-onnx-node');

const vad = new sherpa_onnx.Vad({
  sileroVad: { model: './silero_vad.onnx', threshold: 0.5,
    minSpeechDuration: 0.25, minSilenceDuration: 0.5, windowSize: 512 },
  sampleRate: 16000, debug: false, numThreads: 1,
}, 60);

// Feed audio in chunks of windowSize samples
vad.acceptWaveform(samples);

if (vad.isDetected()) {
  console.log('Speech detected');
}

while (!vad.isEmpty()) {
  const segment = vad.front();
  vad.pop();
  console.log(`Segment: ${segment.samples.length} samples`);
}

Notes

Feed audio in chunks matching windowSize (512 for Silero VAD at 16kHz).
isDetected() returns true while speech is ongoing.
isEmpty() / front() / pop() are used to extract completed speech segments (speech followed by enough silence).
Use flush() at the end of a stream to emit any remaining buffered speech.
You can use ten-vad.onnx instead of silero_vad.onnx by setting the tenVad config field and leaving sileroVad.model empty.

VAD API

Source file

API

Vad

Constructor

Methods

vad.acceptWaveform(samples)

vad.isDetected()

vad.isEmpty()

vad.front(enableExternalBuffer?)

vad.pop()

vad.flush()

vad.clear()

vad.reset()

Properties

CircularBuffer

Constructor

Methods

buffer.push(samples)

buffer.get(startIndex, n, enableExternalBuffer?)

buffer.pop(n)

buffer.size()

buffer.head()

buffer.reset()

Example

Notes

`vad.acceptWaveform(samples)`

`vad.isDetected()`

`vad.isEmpty()`

`vad.front(enableExternalBuffer?)`

`vad.pop()`

`vad.flush()`

`vad.clear()`

`vad.reset()`

`buffer.push(samples)`

`buffer.get(startIndex, n, enableExternalBuffer?)`

`buffer.pop(n)`

`buffer.size()`

`buffer.head()`

`buffer.reset()`