Pre-trained models
Two kinds of end-to-end (E2E) models are supported by k2-fsa/sherpa:
CTC
Transducer
Hint
For transducer-based models, we support only stateless transducers. To the best of our knowledge, icefall is the only framework that provides them, so only transducer models from icefall are currently supported.
For CTC-based models, we support any model trained with CTC loss, as long as you can export it via TorchScript. Models from the following frameworks are currently supported: icefall, WeNet, and torchaudio (Wav2Vec 2.0). If you have a CTC model and want it to be supported in k2-fsa/sherpa, please create an issue at https://github.com/k2-fsa/sherpa/issues.
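As a rough illustration of what "export the model via TorchScript" means, the sketch below scripts a toy CTC-style model with `torch.jit.script`, saves it, and loads it back. `TinyCtcModel`, its layer sizes, and the feature dimension of 80 are hypothetical placeholders and not part of sherpa or icefall. The exact input and output signature that sherpa expects from each framework is not covered here, so treat this only as a sketch of the export step itself.

```python
import io

import torch
import torch.nn as nn


class TinyCtcModel(nn.Module):
    """Hypothetical stand-in for a CTC acoustic model (not a sherpa class)."""

    def __init__(self, feature_dim: int = 80, vocab_size: int = 500) -> None:
        super().__init__()
        self.encoder = nn.Linear(feature_dim, 256)
        self.output = nn.Linear(256, vocab_size)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, num_frames, feature_dim)
        hidden = torch.relu(self.encoder(features))
        # Per-frame log-probabilities over the vocabulary, the form CTC loss consumes.
        return self.output(hidden).log_softmax(dim=-1)


model = TinyCtcModel()
model.eval()

# Compile the module to TorchScript so it can be loaded without the Python class.
scripted = torch.jit.script(model)

# Save to an in-memory buffer and load it back to check the round trip.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
loaded = torch.jit.load(buffer)

features = torch.randn(1, 100, 80)  # 1 utterance, 100 frames, 80-dim features
with torch.no_grad():
    log_probs = loaded(features)

print(log_probs.shape)
```

To write the exported model to disk instead of a buffer, `scripted.save("model.pt")` can be used, where `model.pt` is an arbitrary file name chosen for this example.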
Hint
You can try the pre-trained models in your browser without installing anything. See https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition.
This page lists all available pre-trained models that you can download.
Hint
We provide pre-trained models for the following languages:
Arabic
Chinese
English
German
Tibetan
Hint
We provide a Colab notebook for you to try offline recognition step by step.
It shows how to install sherpa and use it as an offline recognizer, which supports models from icefall, WeNet, and torchaudio.
Pre-trained models
- Offline CTC models
  - icefall
    - icefall-asr-gigaspeech-conformer-ctc (English)
    - icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09 (English)
    - icefall-asr-tedlium3-conformer-ctc2 (English)
    - icefall_asr_librispeech_conformer_ctc (English)
    - icefall_asr_aishell_conformer_ctc (Chinese)
    - icefall-asr-mgb2-conformer_ctc-2022-27-06 (Arabic)
  - WeNet
  - torchaudio
  - NeMo
    - sherpa-nemo-ctc-en-citrinet-512 (English)
    - sherpa-nemo-ctc-zh-citrinet-512 (Chinese)
    - sherpa-nemo-ctc-zh-citrinet-1024-gamma-0-25 (Chinese)
    - sherpa-nemo-ctc-de-citrinet-1024 (German)
    - sherpa-nemo-ctc-en-conformer-small (English)
    - sherpa-nemo-ctc-en-conformer-medium (English)
    - sherpa-nemo-ctc-en-conformer-large (English)
    - sherpa-nemo-ctc-de-conformer-large (German)
    - How to convert NeMo models to sherpa
- Offline transducer models
  - icefall
    - English
      - icefall-asr-librispeech-zipformer-2023-05-15
      - icefall-asr-librispeech-zipformer-small-2023-05-16
      - icefall-asr-librispeech-zipformer-large-2023-05-16
      - icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04
      - icefall-asr-librispeech-pruned-transducer-stateless8-2022-12-02
      - icefall-asr-librispeech-pruned-transducer-stateless8-2022-11-14
      - icefall-asr-librispeech-pruned-transducer-stateless7-2022-11-11
      - icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13
      - icefall-asr-gigaspeech-pruned-transducer-stateless2
    - Chinese
    - Chinese + English
    - Tibetan
- Online transducer models
  - icefall
    - English
      - icefall-asr-librispeech-streaming-zipformer-2023-05-17
      - icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29
      - icefall-asr-librispeech-conv-emformer-transducer-stateless2-2022-07-05
      - icefall-asr-librispeech-lstm-transducer-stateless2-2022-09-03
      - icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01
      - icefall_librispeech_streaming_pruned_transducer_stateless4_20220625
    - Chinese
    - Chinese + English (all-in-one)