sherpa-onnx
Hint
Speech recognition does not require Internet access. Everything is processed locally on your device.
We support using ONNX with onnxruntime to replace PyTorch for neural network computation. The code lives in a separate repository, sherpa-onnx.
sherpa-onnx is self-contained and everything can be compiled from source.
Please refer to https://k2-fsa.github.io/icefall/model-export/export-onnx.html for how to export models to the ONNX format.
In the following, we describe how to build sherpa-onnx for Linux, macOS, Windows, embedded systems, Android, and iOS.
Also, we show how to use it for speech recognition with pre-trained models.
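Before a pre-trained model can transcribe anything, the audio has to be supplied as mono float samples in [-1, 1] at the model's expected sample rate (typically 16 kHz). As a minimal stdlib sketch of that preparation step (the sherpa-onnx calls in the trailing comment are indicative only, not an exact API reference):

```python
import array
import wave


def read_wave(path):
    """Read a 16-bit mono WAV file; return (sample_rate, float samples in [-1, 1])."""
    with wave.open(path, "rb") as f:
        assert f.getnchannels() == 1, "expected mono audio"
        assert f.getsampwidth() == 2, "expected 16-bit samples"
        sample_rate = f.getframerate()
        raw = f.readframes(f.getnframes())
    pcm = array.array("h", raw)  # signed 16-bit integers
    samples = [s / 32768.0 for s in pcm]  # scale to [-1.0, 1.0]
    return sample_rate, samples


# The samples can then be handed to a recognizer stream, roughly:
#   stream = recognizer.create_stream()
#   stream.accept_waveform(sample_rate, samples)
```

In practice you would let sherpa-onnx (or a library such as soundfile) handle this, but the sketch shows the exact sample format the recognizers consume.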
- Tutorials
- 中文资料 (Chinese tutorials)
- 2024-10-09 Getting started with a local intelligent voice assistant based on sherpa (Java API edition)
- 2024-07-03 🆓 Speech recognition engine sherpa-onnx on CPU (part 1): try speech recognition with ease (Docker setup)
- 2024-06-10 SherpaOnnxTtsEngine - a local text-to-speech (TTS) engine for Android
- 2024-06-10 Building 100 applications with LLMs (1): building your own Windows Jarvis from scratch (part 1)
- 2024-05-09 Notes on installing and using sherpa-onnx
- 2024-04-09 Porting sherpa-onnx to rv1106/rv1109/rv1126 for offline TTS
- 2023-08-08 Offline speech recognition with snowboy + next-generation Kaldi (k2-fsa) sherpa-onnx (voice assistant)
- 2023-03-16 k2 speech recognition: how to use sherpa-onnx
- Installation
- Frequently Asked Questions (FAQs)
- The difference between online, offline, streaming, and non-streaming (在线、离线、流式、非流式的区别)
- Cannot open shared library libasound_module_conf_pulse.so
- No sound from the Chinese TTS models (TTS 中文模型没有声音)
- ./gitcompile: line 89: libtoolize: command not found
- OSError: PortAudio library not found
- imports github.com/k2-fsa/sherpa-onnx-go-linux: build constraints exclude all Go files
- External buffers are not allowed
- The given version [17] is not supported, only version 1 to 10 is supported in this build
- Python
- C API
- Java API
- JavaScript API
- Kotlin API
- Swift API
- Go API
- C# API
- Pascal API
- Lazarus
- WebAssembly
- Android
- iOS
- Flutter
- WebSocket
- Hotwords (Contextual biasing)
- Keyword spotting
- Punctuation
- Audio tagging
- Spoken language identification
- VAD
- Pre-trained models
- Online transducer models
- Zipformer-transducer-based Models
- sherpa-onnx-streaming-zipformer-korean-2024-06-16 (Korean)
- sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12 (Chinese)
- k2-fsa/icefall-asr-zipformer-wenetspeech-streaming-small (Chinese)
- k2-fsa/icefall-asr-zipformer-wenetspeech-streaming-large (Chinese)
- pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615 (Chinese)
- csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-06-26 (English)
- csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-06-21 (English)
- csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21 (English)
- csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English)
- shaojieli/sherpa-onnx-streaming-zipformer-fr-2023-04-14 (French)
- sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16 (Bilingual, Chinese + English)
- csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23 (Chinese)
- csukuangfj/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17 (English)
- Conformer-transducer-based Models
- LSTM-transducer-based Models
- Online paraformer models
- Online CTC models
- Offline transducer models
- Zipformer-transducer-based Models
- sherpa-onnx-zipformer-ru-2024-09-18 (Russian, 俄语)
- sherpa-onnx-small-zipformer-ru-2024-09-18 (Russian, 俄语)
- sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01 (Japanese, 日语)
- sherpa-onnx-zipformer-korean-2024-06-24 (Korean, 韩语)
- sherpa-onnx-zipformer-thai-2024-06-20 (Thai, 泰语)
- sherpa-onnx-zipformer-cantonese-2024-03-13 (Cantonese, 粤语)
- sherpa-onnx-zipformer-gigaspeech-2023-12-12 (English)
- zrjin/sherpa-onnx-zipformer-multi-zh-hans-2023-9-2 (Chinese)
- yfyeung/icefall-asr-cv-corpus-13.0-2023-03-09-en-pruned-transducer-stateless7-2023-04-17 (English)
- k2-fsa/icefall-asr-zipformer-wenetspeech-small (Chinese)
- k2-fsa/icefall-asr-zipformer-wenetspeech-large (Chinese)
- pkufool/icefall-asr-zipformer-wenetspeech-20230615 (Chinese)
- csukuangfj/sherpa-onnx-zipformer-large-en-2023-06-26 (English)
- csukuangfj/sherpa-onnx-zipformer-small-en-2023-06-26 (English)
- csukuangfj/sherpa-onnx-zipformer-en-2023-06-26 (English)
- icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04 (English)
- csukuangfj/sherpa-onnx-zipformer-en-2023-04-01 (English)
- csukuangfj/sherpa-onnx-zipformer-en-2023-03-30 (English)
- Conformer-transducer-based Models
- NeMo transducer-based Models
- Offline paraformer models
- Paraformer models
- csukuangfj/sherpa-onnx-paraformer-trilingual-zh-cantonese-en (Chinese + English + Cantonese 粤语)
- csukuangfj/sherpa-onnx-paraformer-en-2024-03-09 (English)
- csukuangfj/sherpa-onnx-paraformer-zh-small-2024-03-09 (Chinese + English)
- csukuangfj/sherpa-onnx-paraformer-zh-2024-03-09 (Chinese + English)
- csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28 (Chinese + English)
- csukuangfj/sherpa-onnx-paraformer-zh-2023-09-14 (Chinese + English)
- Offline CTC models
- TeleSpeech
- Whisper
- WeNet
- Small models
- Online transducer models
- Moonshine
- SenseVoice
- Speaker Diarization
- Pre-trained models
- Huggingface space for speaker diarization
- Android APKs for speaker diarization
- C API examples
- C++ API examples
- C# API examples
- Dart API examples
- Go API examples
- Java API examples
- JavaScript API examples
- Kotlin API examples
- Pascal API examples
- Python API examples
- Rust API examples
- Swift API examples
- Text-to-speech (TTS)
- Huggingface space
- Pre-trained models
- vits
- All models in a single table
- vits-melo-tts-zh_en (Chinese + English, 1 speaker)
- vits-piper-en_US-glados (English, 1 speaker)
- vits-piper-en_US-libritts_r-medium (English, 904 speakers)
- ljspeech (English, single-speaker)
- VCTK (English, multi-speaker, 109 speakers)
- csukuangfj/sherpa-onnx-vits-zh-ll (Chinese, 5 speakers)
- csukuangfj/vits-zh-hf-fanchen-C (Chinese, 187 speakers)
- csukuangfj/vits-zh-hf-fanchen-wnj (Chinese, 1 male)
- csukuangfj/vits-zh-hf-theresa (Chinese, 804 speakers)
- csukuangfj/vits-zh-hf-eula (Chinese, 804 speakers)
- aishell3 (Chinese, multi-speaker, 174 speakers)
- en_US-lessac-medium (English, single-speaker)
- WebAssembly
- Piper
- MMS