Spoken Language Identification API

Spoken language identification API reference for sherpa-onnx-node.

Source file

scripts/node-addon-api/lib/spoken-language-identification.js

API

SpokenLanguageIdentification

Identifies the spoken language in an audio recording.

Constructor

const sli = new sherpa_onnx.SpokenLanguageIdentification(config);
param config

Configuration object with:

  • whisper (object, optional) — Whisper model configuration:

    • encoder (string) — Path to the Whisper encoder ONNX model.

    • decoder (string) — Path to the Whisper decoder ONNX model.

    • tailPaddings (number, optional) — Number of tail padding samples.

  • numThreads (number, optional).

  • debug (boolean, optional).

  • provider (string, optional).

Methods

sli.createStream()
returns

A new OfflineStream for feeding audio.

sli.compute(stream)

Identify the spoken language.

param stream

An OfflineStream.

returns

A two-letter language code (string), e.g. 'en', 'de', 'fr', 'es', 'zh', 'ja', 'ko'.

Properties

  • sli.config — The configuration object.

Example

const sherpa_onnx = require('sherpa-onnx-node');

const sli = new sherpa_onnx.SpokenLanguageIdentification({
  whisper: {
    encoder: './whisper-encoder.onnx',
    decoder: './whisper-decoder.onnx',
  },
});

const stream = sli.createStream();
const wave = sherpa_onnx.readWave('./audio.wav');
stream.acceptWaveform({ samples: wave.samples, sampleRate: wave.sampleRate });

const lang = sli.compute(stream);
console.log(`Detected language: ${lang}`);

Notes

  • Uses a Whisper-based model for language identification.

  • The input audio should be mono, 16kHz, float32 in [-1, 1].

  • Supported languages depend on the Whisper model variant used.