Introduction

sherpa is the deployment framework of the Next-gen Kaldi project.

sherpa supports deploying speech-related pre-trained models on various platforms, with bindings for various programming languages.

If you are interested in training your own model or fine-tuning a pre-trained model, please refer to icefall.

At present, sherpa has the following sub-projects:

k2-fsa/sherpa

k2-fsa/sherpa-onnx

k2-fsa/sherpa-ncnn

The following table compares them:

	k2-fsa/sherpa	k2-fsa/sherpa-onnx	k2-fsa/sherpa-ncnn
Installation difficulty	hard	`easy`	`easy`
NN lib	PyTorch	onnxruntime	ncnn
CPU Support	x86, x86_64	x86, x86_64, `arm32`, `arm64`	x86, x86_64, `arm32`, `arm64`, `RISC-V`
GPU Support	Yes (with `CUDA` for NVIDIA GPUs)	Yes	Yes (with `Vulkan` for ARM GPUs)
OS Support	Linux, Windows, macOS	Linux, Windows, macOS, `iOS`, `Android`	Linux, Windows, macOS, `iOS`, `Android`
Support batch_size > 1	Yes	Yes	`No`
Provided APIs	C++, Python	C, C++, Python, C#, Java, Kotlin, Swift, Go, JavaScript, Dart, Pascal, Rust	C, C++, Python, C#, Kotlin, Swift, Go
Supported functions	streaming speech recognition, non-streaming speech recognition	streaming speech recognition, non-streaming speech recognition, text-to-speech, speaker diarization, speaker identification, speaker verification, spoken language identification, audio tagging, VAD, keyword spotting	streaming speech recognition, VAD
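
As a rough aid for reading the table, the sketch below restates a few of its rows as a Python dictionary and filters the sub-projects by requirement. It is only an illustration: the names `SUBPROJECTS` and `candidates` are hypothetical and are not part of any sherpa API, and the data is copied from the table rather than queried from the projects themselves.

```python
# Illustrative only: restates the CPU, OS, batch-size and function rows of the
# comparison table. SUBPROJECTS and candidates() are hypothetical names.
_BASIC_ASR = {"streaming speech recognition", "non-streaming speech recognition"}
_DESKTOP_OS = {"Linux", "Windows", "macOS"}

SUBPROJECTS = {
    "k2-fsa/sherpa": {
        "cpu": {"x86", "x86_64"},
        "os": _DESKTOP_OS,
        "functions": _BASIC_ASR,
        "batching": True,
    },
    "k2-fsa/sherpa-onnx": {
        "cpu": {"x86", "x86_64", "arm32", "arm64"},
        "os": _DESKTOP_OS | {"iOS", "Android"},
        "functions": _BASIC_ASR | {
            "text-to-speech", "speaker diarization", "speaker identification",
            "speaker verification", "spoken language identification",
            "audio tagging", "VAD", "keyword spotting",
        },
        "batching": True,
    },
    "k2-fsa/sherpa-ncnn": {
        "cpu": {"x86", "x86_64", "arm32", "arm64", "RISC-V"},
        "os": _DESKTOP_OS | {"iOS", "Android"},
        "functions": {"streaming speech recognition", "VAD"},
        "batching": False,
    },
}


def candidates(cpu=None, os=None, function=None, batching=None):
    """Return sub-projects whose table row meets every requirement given."""
    result = []
    for name, row in SUBPROJECTS.items():
        if cpu is not None and cpu not in row["cpu"]:
            continue
        if os is not None and os not in row["os"]:
            continue
        if function is not None and function not in row["functions"]:
            continue
        # batching=True means "batch_size > 1 is required".
        if batching and not row["batching"]:
            continue
        result.append(name)
    return result


if __name__ == "__main__":
    print(candidates(os="Android", function="text-to-speech"))
    print(candidates(cpu="RISC-V"))
```

According to the table, text-to-speech on Android narrows the choice to k2-fsa/sherpa-onnx, while a RISC-V target narrows it to k2-fsa/sherpa-ncnn.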

sherpa also supports Triton. For details, please see Triton.