tiny.en
You can use the following commands to download the exported ONNX models of tiny.en:
Hint

Please replace tiny.en with base.en, small.en, medium.en, distil-small.en, tiny, base, small, or medium if you want to try a different type of model.
cd /path/to/sherpa-onnx
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2
tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2
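If you want to script the model selection described in the hint above, the release assets all follow the same naming pattern, so a shell variable is enough. The following is a sketch; base.en is just one example value:

name=base.en  # or small.en, medium.en, distil-small.en, tiny, base, small, medium
cd /path/to/sherpa-onnx
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-$name.tar.bz2
tar xvf sherpa-onnx-whisper-$name.tar.bz2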
Please check that the file sizes of the downloaded models are correct. See the sizes of the *.onnx files below.
(py38) fangjuns-MacBook-Pro:sherpa-onnx-whisper-tiny.en fangjun$ ls -lh *.onnx
-rw-r--r-- 1 fangjun staff 105M Aug 7 16:22 tiny.en-decoder.int8.onnx
-rw-r--r-- 1 fangjun staff 185M Aug 7 16:23 tiny.en-decoder.onnx
-rw-r--r-- 1 fangjun staff 12M Aug 7 16:22 tiny.en-encoder.int8.onnx
-rw-r--r-- 1 fangjun staff 36M Aug 7 16:22 tiny.en-encoder.onnx
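An incomplete download usually shows up as a truncated file, so a quick automated check is possible. The loop below is only a sketch; the 1 MB threshold is an arbitrary assumption, not a value from sherpa-onnx:

for f in sherpa-onnx-whisper-tiny.en/*.onnx; do
  size=$(wc -c < "$f")             # file size in bytes
  if [ "$size" -lt 1000000 ]; then
    echo "warning: $f is only $size bytes; please re-download it"
  fi
done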
To use the downloaded files to decode waves, please run:
Hint
Please first follow Installation to build sherpa-onnx before you continue.
cd /path/to/sherpa-onnx
./build/bin/sherpa-onnx-offline \
--whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \
--whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \
--tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \
./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \
./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav
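You are not limited to the bundled test waves; you can pass your own recording instead. The command below is a sketch, with /path/to/your-recording.wav as a placeholder; a 16 kHz, 16-bit, single-channel wave file is the safest input format:

./build/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
  /path/to/your-recording.wav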
To use the int8 quantized models, please run:
cd /path/to/sherpa-onnx
./build/bin/sherpa-onnx-offline \
--whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx \
--whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx \
--tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \
./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \
./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav
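To see what quantization costs in accuracy on your own data, one option is to run both variants on the same wave and compare the transcripts. This loop is a sketch that only relies on the .int8 infix in the file names shown above:

for suffix in "" ".int8"; do
  ./build/bin/sherpa-onnx-offline \
    --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder${suffix}.onnx \
    --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder${suffix}.onnx \
    --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
    ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav
done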
Real-time factor (RTF) on Raspberry Pi 4 Model B
One of the test commands is given below:
./sherpa-onnx-offline \
--num-threads=1 \
--whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \
--whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \
--tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav
And its output is:
/root/fangjun/open-source/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx-offline --num-threads=1 --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav
OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx", decoder="./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx"), tokens="./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt", num_threads=1, debug=False, provider="cpu", model_type=""), lm_config=OfflineLMConfig(model="", scale=0.5), decoding_method="greedy_search", max_active_paths=4, context_score=1.5)
Creating recognizer ...
Started
Done!
./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav
{"text":" God, as a direct consequence of the sin which man thus punished, had given her a lovely child, whose place was on that same dishonored bosom to connect her parent forever with the race and descent of mortals, and to be finally a blessed soul in heaven.","timestamps":"[]","tokens":[" God",","," as"," a"," direct"," consequence"," of"," the"," sin"," which"," man"," thus"," punished",","," had"," given"," her"," a"," lovely"," child",","," whose"," place"," was"," on"," that"," same"," dishon","ored"," bos","om"," to"," connect"," her"," parent"," forever"," with"," the"," race"," and"," descent"," of"," mortals",","," and"," to"," be"," finally"," a"," blessed"," soul"," in"," heaven","."]}
----
num threads: 1
decoding method: greedy_search
Elapsed seconds: 11.454 s
Real time factor (RTF): 11.454 / 16.715 = 0.685
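The RTF printed above is simply the decoding time divided by the audio duration, so any value below 1 means decoding runs faster than real time. A minimal sketch of the arithmetic, using the numbers from this log:

elapsed=11.454   # decoding time in seconds (from the log above)
duration=16.715  # duration of 1.wav in seconds (from the log above)
echo "scale=3; $elapsed / $duration" | bc   # prints .685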
The following table compares the RTF for different numbers of threads and types of ONNX models:
Model type | Number of threads | RTF
-----------|-------------------|------
float32    | 1                 | 0.685
float32    | 2                 | 0.559
float32    | 3                 | 0.526
float32    | 4                 | 0.520
int8       | 1                 | 0.547
int8       | 2                 | 0.431
int8       | 3                 | 0.398
int8       | 4                 | 0.386
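The table can be reproduced with a loop over --num-threads. The sketch below assumes you run it on the Raspberry Pi and read off the RTF line printed at the end of each run; swap in the .int8 files for the int8 rows:

for t in 1 2 3 4; do
  echo "=== num-threads: $t ==="
  ./sherpa-onnx-offline \
    --num-threads=$t \
    --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \
    --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \
    --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
    ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav
done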