Streaming ASR
This page describes how to use the C++ API of sherpa for streaming/online ASR.
Warning
It supports only models from https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/conv_emformer_transducer_stateless2 at present.
Please refer to Installation for build and installation instructions.
After running make -j, you should find the following files:
lib/libsherpa_online_recognizer.so
include/sherpa/cpp_api/online_recognizer.h
include/sherpa/cpp_api/online_stream.h
You can include the above two header files in your application and link libsherpa_online_recognizer.so with your executable to use the C++ APIs.
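Before linking, it can help to confirm that the build actually produced the three files listed above. The following is a minimal sketch and not part of sherpa: check_sherpa_artifacts is a hypothetical helper name, and /path/to/sherpa/build and my_app.cc are placeholders. The commented g++ line is only an assumed starting point; because sherpa is built on PyTorch, a real build will likely need additional include and library flags for its dependencies.

```shell
# Hypothetical helper (not shipped with sherpa): report which of the
# expected build artifacts are missing under the given build directory.
# Returns 0 if all files exist and 1 otherwise.
check_sherpa_artifacts() {
  build_dir="$1"
  missing=0
  for f in \
    lib/libsherpa_online_recognizer.so \
    include/sherpa/cpp_api/online_recognizer.h \
    include/sherpa/cpp_api/online_stream.h; do
    if [ ! -f "$build_dir/$f" ]; then
      echo "missing: $build_dir/$f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage sketch (all paths and my_app.cc are placeholders; extra flags for
# PyTorch and other dependencies are probably required in practice):
#
#   check_sherpa_artifacts /path/to/sherpa/build && \
#   g++ -std=c++14 my_app.cc \
#     -I /path/to/sherpa/build/include \
#     -L /path/to/sherpa/build/lib -lsherpa_online_recognizer \
#     -o my_app
```

If the helper reports a missing file, re-check that make -j finished without errors before attempting to link.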
https://github.com/k2-fsa/sherpa/blob/master/sherpa/cpp_api/test_online_recognizer_microphone.cc
shows how to use the C++ API for real-time speech recognition with a microphone.
After running make -j, you can also find an executable bin/test_online_recognizer_microphone.
The following shows how to use it:
cd /path/to/sherpa/build
git lfs install
git clone https://huggingface.co/Zengwei/icefall-asr-librispeech-conv-emformer-transducer-stateless2-2022-07-05
./bin/test_online_recognizer_microphone \
./icefall-asr-librispeech-conv-emformer-transducer-stateless2-2022-07-05/exp/cpu-jit-epoch-30-avg-10-torch-1.10.0.pt \
./icefall-asr-librispeech-conv-emformer-transducer-stateless2-2022-07-05/data/lang_bpe_500/tokens.txt
It prints output similar to the following:
num devices: 4
Use default device: 2
Name: MacBook Pro Microphone
Max input channels: 1
Started
Say something and you will see the recognition result printed to the console in real-time.
You can find a demo below: