Decode files

In this section, we demonstrate how to use the Python API of sherpa-onnx to decode files.

Hint

Only single-channel (mono) WAVE files with 16-bit samples are supported. The sample rate can be arbitrary; it does not need to be 16 kHz.
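To check a file against these constraints before decoding, a minimal sketch using only the Python standard library might look like the following. The helper name `read_wave` is hypothetical and is not part of sherpa-onnx. Scaling the samples to the range [-1, 1) is an assumption about what a caller would want, not a statement about what the example scripts do internally.

```python
import array
import sys
import wave
from typing import List, Tuple


def read_wave(path: str) -> Tuple[List[float], int]:
    """Read a mono, 16-bit WAVE file (hypothetical helper, not sherpa-onnx API).

    Returns (samples, sample_rate), with samples scaled to [-1, 1).
    Raises ValueError if the file is not single-channel 16-bit PCM.
    """
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1:
            raise ValueError(f"{path}: expected 1 channel, got {f.getnchannels()}")
        if f.getsampwidth() != 2:
            raise ValueError(
                f"{path}: expected 16-bit samples, got {8 * f.getsampwidth()}-bit"
            )
        sample_rate = f.getframerate()  # any rate is accepted, e.g. 8000 or 16000
        raw = f.readframes(f.getnframes())

    pcm = array.array("h")  # signed 16-bit integers
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAVE data is stored little-endian

    return [s / 32768.0 for s in pcm], sample_rate
```

Calling `read_wave("./path/to/file.wav")` on a stereo or 8-bit file raises `ValueError`, which makes the unsupported format visible before any model is loaded.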

Streaming zipformer

We use csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English) as an example below.

cd /path/to/sherpa-onnx

python3 ./python-api-examples/online-decode-files.py \
  --tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \
  --encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx \
  ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav \
  ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav \
  ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/2.wav \
  ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/3.wav \
  ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/8k.wav

Note

You can replace encoder-epoch-99-avg-1.onnx with encoder-epoch-99-avg-1.int8.onnx to use int8 models for decoding.
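The file name substitution described in this note can be sketched as a small helper. The function name `int8_variant` is hypothetical, and the sketch assumes that the int8 file sits next to the original model file and differs only by the `.int8` infix; check the model directory to confirm which int8 files it actually contains.

```python
from pathlib import Path


def int8_variant(model_path: str) -> Path:
    """Map e.g. encoder-epoch-99-avg-1.onnx to encoder-epoch-99-avg-1.int8.onnx.

    Hypothetical helper: it only rewrites the file name and does not
    check that the int8 file exists on disk.
    """
    p = Path(model_path)
    if p.suffix != ".onnx":
        raise ValueError(f"expected a .onnx file, got {model_path}")
    if p.stem.endswith(".int8"):
        return p  # already an int8 file name
    return p.with_name(f"{p.stem}.int8{p.suffix}")


model_dir = "sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20"
print(int8_variant(f"{model_dir}/encoder-epoch-99-avg-1.onnx"))
```

The resulting path can then be passed to `--encoder` in place of the original one.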

The output is given below:

Creating a resampler:
   in_sample_rate: 8000
   output_sample_rate: 16000

Started!
Done!
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav
昨天是 MONDAY TODAY IS LIBR THE DAY AFTER TOMORROW是星期三
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav
这是第一种第二种叫呃与 ALWAYS ALWAYS什么意思啊
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/2.wav
这个是频繁的啊不认识记下来 FREQUENTLY频繁的
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/3.wav
第一句是个什么时态加了 ES是一般现在时对后面还有时态写上
----------
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/8k.wav
嗯太不要准时 IN TIME是及时叫他总是准时教他的作业那用一般现在时是没有什么感情色彩的陈述一个事实下一句话为什么要用现在进行时它的意思并不是说说他现在正在教他的
----------
num_threads: 1
decoding_method: greedy_search
Wave duration: 17.640 s
Elapsed time: 3.907 s
Real time factor (RTF): 3.907/17.640 = 0.221
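The last line of the log computes the real time factor (RTF) as the elapsed decoding time divided by the reported wave duration. A value below 1 means decoding ran faster than real time. The short sketch below reproduces that arithmetic for the three runs shown in this section; the function name `real_time_factor` is chosen for illustration only.

```python
def real_time_factor(elapsed_seconds: float, audio_seconds: float) -> float:
    """Return elapsed decoding time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return elapsed_seconds / audio_seconds


# (elapsed time, wave duration) pairs copied from the logs in this section
runs = {
    "streaming zipformer": (3.907, 17.640),
    "non-streaming zipformer": (2.567, 4.825),
    "non-streaming paraformer": (1.663, 4.204),
}

for name, (elapsed, duration) in runs.items():
    rtf = real_time_factor(elapsed, duration)
    print(f"{name}: RTF = {elapsed:.3f}/{duration:.3f} = {rtf:.3f}")
```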

Non-streaming zipformer

We use csukuangfj/sherpa-onnx-zipformer-en-2023-04-01 (English) as an example below.

cd /path/to/sherpa-onnx

python3 ./python-api-examples/offline-decode-files.py \
  --tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt \
  --encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx \
  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav \
  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav \
  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav

Note

You can replace encoder-epoch-99-avg-1.onnx with encoder-epoch-99-avg-1.int8.onnx to use int8 models for decoding.

The output is given below:

Creating a resampler:
   in_sample_rate: 8000
   output_sample_rate: 16000

Started!
Done!
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav
 AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROTHELS
----------
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav
 GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONOURED BOSOM TO CONNECT HER PARENT FOR EVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN
----------
./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav
 YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION
----------
num_threads: 1
decoding_method: greedy_search
Wave duration: 4.825 s
Elapsed time: 2.567 s
Real time factor (RTF): 2.567/4.825 = 0.532

Non-streaming paraformer

We use csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28 (Chinese + English) as an example below.

cd /path/to/sherpa-onnx

python3 ./python-api-examples/offline-decode-files.py \
  --tokens=./sherpa-onnx-paraformer-zh-2023-03-28/tokens.txt \
  --paraformer=./sherpa-onnx-paraformer-zh-2023-03-28/model.onnx \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/0.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/1.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/2.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/8k.wav

Note

You can replace model.onnx with model.int8.onnx to use int8 models for decoding.

The output is given below:

Creating a resampler:
   in_sample_rate: 8000
   output_sample_rate: 16000

Started!
Done!
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/0.wav
对我做了介绍啊那么我想说的是呢大家如果对我的研究感兴趣呢你
----------
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/1.wav
重点呢想谈三个问题首先呢就是这一轮全球金融动荡的表现
----------
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/2.wav
深入的分析这一次全球金融动荡背后的根源
----------
./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/8k.wav
甚至出现交易几乎停滞的情况
----------
num_threads: 1
decoding_method: greedy_search
Wave duration: 4.204 s
Elapsed time: 1.663 s
Real time factor (RTF): 1.663/4.204 = 0.396