Export Whisper to ONNX
This section describes how to export Whisper models to onnx.
Available models
Note that we have already exported Whisper models to onnx and they are available from the following huggingface repositories:
Model type |
Huggingface repo |
国内用户 |
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-base.en |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-small.en |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-small.en |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-medium.en |
|
|
||
|
||
|
||
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium |
|
|
||
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-large-v1 |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-large-v2 |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-large-v3 |
|
|
||
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-large-v2 |
|
|
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium-aishell |
Note
You can also download them from
Models end with .en
support only English and all
other models are multilingual.
If you want to export the models by yourself or/and want to learn how the models are exported, please read below.
Export to onnx
We use
to export Whisper models to onnx.
First, let us install dependencies and download the export script
pip install torch openai-whisper onnxruntime onnx librosa soundfile
git clone https://github.com/k2-fsa/sherpa-onnx/
cd sherpa-onnx/scripts/whisper
python3 ./export-onnx.py --help
It will print the following message:
usage: export-onnx.py [-h] --model
{tiny,tiny.en,base,base.en,small,small.en,medium,medium.en,large,large-v1,large-v2,large-v3,distil-medium.en,distil-small.en,distil-large-v2,medium-aishell}
optional arguments:
-h, --help show this help message and exit
--model {tiny,tiny.en,base,base.en,small,small.en,medium,medium.en,large,large-v1,large-v2,large-v3,distil-medium.en,distil-small.en,distil-large-v2,medium-aishell}
Example 1: Export tiny.en
To export tiny.en
, we can use:
python3 ./export-onnx.py --model tiny.en
It will generate the following files:
(py38) fangjuns-MacBook-Pro:whisper fangjun$ ls -lh tiny.en-*
-rw-r--r-- 1 fangjun staff 105M Aug 7 15:43 tiny.en-decoder.int8.onnx
-rw-r--r-- 1 fangjun staff 185M Aug 7 15:43 tiny.en-decoder.onnx
-rw-r--r-- 1 fangjun staff 12M Aug 7 15:43 tiny.en-encoder.int8.onnx
-rw-r--r-- 1 fangjun staff 36M Aug 7 15:43 tiny.en-encoder.onnx
-rw-r--r-- 1 fangjun staff 816K Aug 7 15:43 tiny.en-tokens.txt
tiny.en-encoder.onnx
is the encoder model and tiny.en-decoder.onnx
is the
decoder model.
tiny.en-encoder.int8.onnx
is the quantized encoder model and tiny.en-decoder.onnx
is the
quantized decoder model.
tiny.en-tokens.txt
contains the token table, which maps an integer to a token and vice versa.
- To check whether the exported model works correctly, we can use
https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/whisper/test.py
We use https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/main/test_wavs/0.wav as the test wave.
pip install kaldi-native-fbank
wget https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/main/test_wavs/0.wav
python3 ./test.py \
--encoder ./tiny.en-encoder.onnx \
--decoder ./tiny.en-decoder.onnx \
--tokens ./tiny.en-tokens.txt \
./0.wav
To test int8
quantized models, we can use:
python3 ./test.py \
--encoder ./tiny.en-encoder.int8.onnx \
--decoder ./tiny.en-decoder.int8.onnx \
--tokens ./tiny.en-tokens.txt \
./0.wav
Example 2: Export large-v3
To export large-v3
, we can use:
python3 ./export-onnx.py --model large-v3
It will generate the following files:
(py38) fangjuns-MacBook-Pro:whisper fangjun$ ls -lh large-v3-*
-rw-r--r-- 1 fangjun staff 2.7M Jul 12 20:38 large-v3-decoder.onnx
-rw-r--r-- 1 fangjun staff 3.0G Jul 12 20:38 large-v3-decoder.weights
-rw-r--r-- 1 fangjun staff 744K Jul 12 20:35 large-v3-encoder.onnx
-rw-r--r-- 1 fangjun staff 2.8G Jul 12 20:35 large-v3-encoder.weights
-rw-r--r-- 1 fangjun staff 798K Jul 12 20:32 large-v3-tokens.txt
large-v3-encoder.onnx
is the encoder model and large-v3-decoder.onnx
is the
decoder model.
Note that for large
models, there will also be two additional weights
files.
large-v3-tokens.txt
contains the token table, which maps an integer to a token and vice versa.
- To check whether the exported model works correctly, we can use
https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/whisper/test.py
We use https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/main/test_wavs/0.wav as the test wave.
pip install kaldi-native-fbank
wget https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/main/test_wavs/0.wav
python3 ./test.py \
--encoder ./large-v3-encoder.onnx \
--decoder ./large-v3-decoder.onnx \
--tokens ./large-v3-tokens.txt \
./0.wav