Decode files
In this section, we demonstrate how to use the Python API of sherpa-onnx to decode files.
Hint
Only single-channel WAVE files with 16-bit samples are supported. The sample rate can be arbitrary and does not need to be 16 kHz; files at other sample rates are resampled, as the "Creating a resampler" lines in the outputs below show.
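To check a file against these requirements before decoding, a small helper based on Python's standard `wave` module may be enough. This is only a sketch; `check_wave` is an illustrative name and is not part of sherpa-onnx.

```python
import wave


def check_wave(path):
    """Validate a WAVE file and return (sample_rate, num_samples).

    Raises ValueError if the file is not single channel or not 16-bit.
    Any sample rate is accepted.
    """
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1:
            raise ValueError(f"{path}: expected 1 channel, got {f.getnchannels()}")
        # getsampwidth() returns the sample width in bytes; 2 bytes == 16 bits
        if f.getsampwidth() != 2:
            raise ValueError(f"{path}: expected 16-bit samples, got {8 * f.getsampwidth()}-bit")
        return f.getframerate(), f.getnframes()
```

Calling `check_wave("./test_wavs/8k.wav")` on a valid file returns its sample rate and number of samples; a stereo or 8-bit file raises `ValueError`.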
Streaming zipformer
We use csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English) as an example below.
cd /path/to/sherpa-onnx
python3 ./python-api-examples/online-decode-files.py \
--tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \
--encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \
--decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \
--joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx \
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav \
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav \
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/2.wav \
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/3.wav \
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/8k.wav
Hint
online-decode-files.py is from https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/online-decode-files.py
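As a rough sketch of what decoding files through the streaming Python API can look like, the snippet below feeds each file to its own stream and keeps decoding until no stream has frames left. It assumes the `sherpa_onnx` and `numpy` packages are installed. `read_wave`, `create_streaming_recognizer`, and `decode_streaming` are illustrative names, not part of sherpa-onnx, and the 0.66-second tail padding and the 16 kHz, 80-dimensional feature settings are assumptions made here. The linked script remains the authoritative version.

```python
import wave

import numpy as np


def read_wave(path):
    """Read a single-channel, 16-bit WAVE file.

    Returns (samples, sample_rate) with samples as float32 in [-1, 1).
    """
    with wave.open(path, "rb") as f:
        assert f.getnchannels() == 1, f.getnchannels()
        assert f.getsampwidth() == 2, f.getsampwidth()
        sample_rate = f.getframerate()
        data = f.readframes(f.getnframes())
    # WAVE stores little-endian signed 16-bit integers
    samples = np.frombuffer(data, dtype="<i2").astype(np.float32) / 32768
    return samples, sample_rate


def create_streaming_recognizer(model_dir, num_threads=1):
    """Build a streaming transducer recognizer from a model directory."""
    import sherpa_onnx  # imported lazily so the helpers above work without it

    return sherpa_onnx.OnlineRecognizer.from_transducer(
        tokens=f"{model_dir}/tokens.txt",
        encoder=f"{model_dir}/encoder-epoch-99-avg-1.onnx",
        decoder=f"{model_dir}/decoder-epoch-99-avg-1.onnx",
        joiner=f"{model_dir}/joiner-epoch-99-avg-1.onnx",
        num_threads=num_threads,
        sample_rate=16000,  # assumption: rate expected by the model
        feature_dim=80,  # assumption: feature dimension of the model
        decoding_method="greedy_search",
    )


def decode_streaming(recognizer, wave_files):
    """Decode each file on its own stream and return one result per file."""
    streams = []
    for path in wave_files:
        samples, sample_rate = read_wave(path)
        s = recognizer.create_stream()
        s.accept_waveform(sample_rate, samples)
        # Assumption: append some trailing silence so the final frames are decoded
        tail = np.zeros(int(0.66 * sample_rate), dtype=np.float32)
        s.accept_waveform(sample_rate, tail)
        s.input_finished()
        streams.append(s)

    while True:
        ready = [s for s in streams if recognizer.is_ready(s)]
        if not ready:
            break
        recognizer.decode_streams(ready)

    return [recognizer.get_result(s) for s in streams]
```

With the model from this section downloaded, `decode_streaming(create_streaming_recognizer("./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20"), wave_files)` would return the recognized text for each entry of `wave_files`. Keeping recognizer construction separate from the decoding loop makes the loop easy to reuse with other streaming models.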
Note
You can replace encoder-epoch-99-avg-1.onnx with encoder-epoch-99-avg-1.int8.onnx to use the int8 model for decoding.
The output is given below:
Creating a resampler:
in_sample_rate: 8000
output_sample_rate: 16000
Started!
Done!
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav
昨天是 MONDAY TODAY IS LIBR THE DAY AFTER TOMORROW是星期三
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav
这是第一种第二种叫呃与 ALWAYS ALWAYS什么意思啊
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/2.wav
这个是频繁的啊不认识记下来 FREQUENTLY频繁的
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/3.wav
第一句是个什么时态加了 ES是一般现在时对后面还有时态写上
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/8k.wav
嗯太不要准时 IN TIME是及时叫他总是准时教他的作业那用一般现在时是没有什么感情色彩的陈述一个事实下一句话为什么要用现在进行时它的意思并不是说说他现在正在教他的
----------
num_threads: 1
decoding_method: greedy_search
Wave duration: 17.640 s
Elapsed time: 3.907 s
Real time factor (RTF): 3.907/17.640 = 0.221
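The real time factor on the last line is the elapsed decoding time divided by the duration of the audio, so a value below 1 means decoding ran faster than real time. The arithmetic can be reproduced with a small helper; `real_time_factor` is an illustrative name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds, audio_seconds):
    """Return elapsed decoding time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return elapsed_seconds / audio_seconds


# Values taken from the output above
print(f"{real_time_factor(3.907, 17.640):.3f}")  # prints 0.221
```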
Non-streaming zipformer
We use csukuangfj/sherpa-onnx-zipformer-en-2023-04-01 (English) as an example below.
cd /path/to/sherpa-onnx
python3 ./python-api-examples/offline-decode-files.py \
--tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt \
--encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx \
--decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx \
--joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx \
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav \
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav \
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav
Hint
offline-decode-files.py is from https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/offline-decode-files.py
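For non-streaming models, each file is handed over as a whole and all streams are decoded in a single call. The following is a minimal sketch of that flow, assuming the `sherpa_onnx` and `numpy` packages are installed. `read_wave`, `create_offline_transducer`, and `decode_offline` are illustrative names, not part of sherpa-onnx, and the 16 kHz, 80-dimensional feature settings are assumptions made here. Refer to the linked script for the full version.

```python
import wave

import numpy as np


def read_wave(path):
    """Read a single-channel, 16-bit WAVE file.

    Returns (samples, sample_rate) with samples as float32 in [-1, 1).
    """
    with wave.open(path, "rb") as f:
        assert f.getnchannels() == 1, f.getnchannels()
        assert f.getsampwidth() == 2, f.getsampwidth()
        sample_rate = f.getframerate()
        data = f.readframes(f.getnframes())
    # WAVE stores little-endian signed 16-bit integers
    samples = np.frombuffer(data, dtype="<i2").astype(np.float32) / 32768
    return samples, sample_rate


def create_offline_transducer(model_dir, num_threads=1):
    """Build a non-streaming transducer recognizer from a model directory."""
    import sherpa_onnx  # imported lazily so the helpers here work without it

    return sherpa_onnx.OfflineRecognizer.from_transducer(
        encoder=f"{model_dir}/encoder-epoch-99-avg-1.onnx",
        decoder=f"{model_dir}/decoder-epoch-99-avg-1.onnx",
        joiner=f"{model_dir}/joiner-epoch-99-avg-1.onnx",
        tokens=f"{model_dir}/tokens.txt",
        num_threads=num_threads,
        sample_rate=16000,  # assumption: rate expected by the model
        feature_dim=80,  # assumption: feature dimension of the model
        decoding_method="greedy_search",
    )


def decode_offline(recognizer, wave_files):
    """Decode all files in one batch and return one text per file."""
    streams = []
    for path in wave_files:
        samples, sample_rate = read_wave(path)
        s = recognizer.create_stream()
        s.accept_waveform(sample_rate, samples)
        streams.append(s)
    recognizer.decode_streams(streams)
    return [s.result.text for s in streams]
```

`decode_offline` only relies on `create_stream`, `decode_streams`, and the `result.text` attribute of each stream, so the same loop should also work with a recognizer created by `sherpa_onnx.OfflineRecognizer.from_paraformer(paraformer=..., tokens=...)`.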
Note
You can replace encoder-epoch-99-avg-1.onnx with encoder-epoch-99-avg-1.int8.onnx to use the int8 model for decoding.
The output is given below:
Creating a resampler:
in_sample_rate: 8000
output_sample_rate: 16000
Started!
Done!
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav
AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROTHELS
----------
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav
GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONOURED BOSOM TO CONNECT HER PARENT FOR EVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN
----------
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav
YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION
----------
num_threads: 1
decoding_method: greedy_search
Wave duration: 4.825 s
Elapsed time: 2.567 s
Real time factor (RTF): 2.567/4.825 = 0.532
Non-streaming paraformer
We use csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28 (Chinese + English) as an example below.
cd /path/to/sherpa-onnx
python3 ./python-api-examples/offline-decode-files.py \
--tokens=./sherpa-onnx-paraformer-zh-2023-03-28/tokens.txt \
--paraformer=./sherpa-onnx-paraformer-zh-2023-03-28/model.onnx \
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/0.wav \
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/1.wav \
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/2.wav \
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/8k.wav
Note
You can replace model.onnx with model.int8.onnx to use the int8 model for decoding.
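The int8 file names in the notes of this section follow one pattern: `.int8` is inserted before the `.onnx` extension. If you build model paths in Python, a helper along these lines can derive the int8 name. `int8_variant` is a hypothetical helper, not part of sherpa-onnx, and whether a given model directory actually ships the int8 file is something to check on disk.

```python
from pathlib import Path


def int8_variant(model_path):
    """Map e.g. model.onnx to model.int8.onnx, keeping the directory."""
    p = Path(model_path)
    if p.suffix != ".onnx":
        raise ValueError(f"not an .onnx file: {model_path}")
    if p.name.endswith(".int8.onnx"):
        return p  # already the int8 variant
    return p.with_name(p.stem + ".int8.onnx")


print(int8_variant("model.onnx").name)  # prints model.int8.onnx
```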
The output is given below:
Creating a resampler:
in_sample_rate: 8000
output_sample_rate: 16000
Started!
Done!
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/0.wav
对我做了介绍啊那么我想说的是呢大家如果对我的研究感兴趣呢你
----------
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/1.wav
重点呢想谈三个问题首先呢就是这一轮全球金融动荡的表现
----------
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/2.wav
深入的分析这一次全球金融动荡背后的根源
----------
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/8k.wav
甚至出现交易几乎停滞的情况
----------
num_threads: 1
decoding_method: greedy_search
Wave duration: 4.204 s
Elapsed time: 1.663 s
Real time factor (RTF): 1.663/4.204 = 0.396