Triton
NVIDIA Triton Inference Server provides a cloud and edge inference solution optimized for both CPUs and GPUs.
The following sections describe how to deploy ASR models trained with icefall using Triton.
Environment Preparation
Triton Server
Triton Client
Benchmark with Perf Analyzer
TensorRT acceleration