Introduction
sherpa is the deployment framework of the Next-gen Kaldi project.
sherpa does only one thing: it uses a pre-trained model to transcribe speech. If you are interested in training your own model or fine-tuning a pre-trained model, please refer to icefall.
At present, sherpa has the following sub-projects:
The differences are compared below:
| | | | |
|---|---|---|---|
| Installation difficulty | hard | | |
| NN lib | | | |
| CPU Support | x86, x86_64 | x86, x86_64, arm32, arm64 | x86, x86_64, arm32, arm64, **RISC-V** |
| GPU Support | Yes (with CUDA for NVIDIA GPUs) | Yes | Yes (with Vulkan for ARM GPUs) |
| OS Support | Linux, Windows, macOS | Linux, Windows, macOS, iOS, Android | Linux, Windows, macOS, iOS, Android |
| Support batch_size > 1 | Yes | Yes | |
| Provided APIs | C++, Python | C, C++, Python, C#, Java, Kotlin, Swift | C, C++, Python, C#, Kotlin, Swift |
| Model types | streaming, non-streaming | streaming, non-streaming | streaming only |