Introduction
sherpa is the deployment framework of the Next-gen Kaldi project.
sherpa supports deploying pre-trained speech-related models on a variety of platforms, with bindings for several programming languages.
If you are interested in training your own model or fine-tuning a pre-trained one, please refer to icefall.
At present, sherpa has the following sub-projects:
They differ as follows:
| | | | |
|---|---|---|---|
| Installation difficulty | hard | | |
| NN lib | | | |
| CPU Support | x86, x86_64 | x86, x86_64, arm32, arm64 | x86, x86_64, arm32, arm64, **RISC-V** |
| GPU Support | Yes (with CUDA for NVIDIA GPUs) | Yes | Yes (with Vulkan for ARM GPUs) |
| OS Support | Linux, Windows, macOS | Linux, Windows, macOS, iOS, Android | Linux, Windows, macOS, iOS, Android |
| Support batch_size > 1 | Yes | Yes | |
| Provided APIs | C++, Python | C, C++, Python, C#, Java, Kotlin, Swift, Go, JavaScript, Dart, Pascal, Rust | C, C++, Python, C#, Kotlin, Swift, Go |
| Supported functions | streaming speech recognition, non-streaming speech recognition | streaming speech recognition, non-streaming speech recognition, text-to-speech, speaker diarization, speaker identification, speaker verification, spoken language identification, audio tagging, VAD, keyword spotting | streaming speech recognition, VAD |