Pre-trained models

The following table lists download links for all pre-trained models.

| Description | URL |
|---|---|
| Speech recognition (speech to text, ASR) | https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models |
| Text to speech (TTS) | https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models |
| Voice activity detection (VAD) | https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx |
| Keyword spotting | https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models |
| Speaker identification (Speaker ID) | https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models |
| Spoken language identification (Language ID) | https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models (multilingual Whisper models) |
| Audio tagging | https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models |
| Punctuation | https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models |
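Most entries in the table point to release pages, but the VAD entry is a direct link to a single file. As a minimal sketch of how such a file could be fetched from Python, the following uses only the standard library. The helper names `local_name` and `download` are hypothetical and not part of sherpa-onnx; only the URL comes from the table.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# Direct download link for the VAD model, copied from the table above.
VAD_MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "asr-models/silero_vad.onnx"
)


def local_name(url: str) -> str:
    """Return the last path component of the URL, used as the local filename."""
    return Path(urlparse(url).path).name


def download(url: str, dest_dir: str = ".") -> Path:
    """Hypothetical helper: fetch url into dest_dir unless the file already exists."""
    dest = Path(dest_dir) / local_name(url)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # urlretrieve follows the HTTP redirects that release downloads use.
        urllib.request.urlretrieve(url, str(dest))
    return dest


if __name__ == "__main__":
    # Running the script performs the actual download (requires network access).
    print(download(VAD_MODEL_URL))
```

The other links are release pages that list many files, so this helper would need the URL of a specific file from one of those pages.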

In this section, we describe how to download and use all available pre-trained models for speech recognition.

Hint

Please install git-lfs before you continue.

Otherwise, you will be SAD later.
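
A quick way to confirm that git-lfs is available before continuing is to look for it on the `PATH`. The following is a minimal sketch using the Python standard library; the function name `git_lfs_version` is hypothetical, and it assumes the executable is installed under its usual name `git-lfs`.

```python
import shutil
import subprocess
from typing import Optional


def git_lfs_version() -> Optional[str]:
    """Return the git-lfs version string, or None if git-lfs is not on PATH."""
    exe = shutil.which("git-lfs")  # assumption: the binary is named git-lfs
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "version"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = git_lfs_version()
    if version is None:
        print("git-lfs not found; please install it before you continue.")
    else:
        print("Found", version)
```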