sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10 (Cantonese, 粤语)

This model is converted from

https://huggingface.co/ASLP-lab/WSYue-ASR/tree/main/u2pp_conformer_yue

It was trained on 21.8k hours of data.

Hint

If you want a Cantonese ASR model, please choose this model or sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09 (Chinese, English, Japanese, Korean, Cantonese, 中英日韩粤语)

Hugging Face space

You can visit

https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition

to try this model in your browser.

Hint

You need to first select the language Cantonese and then select the model csukuangfj/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.

Android APKs

Real-time speech recognition Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/android/apk-simulate-streaming-asr.html

Hint

Please always download the latest version.

Please search for wenetspeech_yue_u2pconformer_ctc_2025_09_10.

Download

Please use the following commands to download it:

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2
tar xf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2
rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2

After downloading, you should find the following files:

ls -lh sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/

total 263264
-rw-r--r--   1 fangjun  staff   129B Sep 10 14:18 README.md
-rw-r--r--   1 fangjun  staff   128M Sep 10 14:18 model.int8.onnx
drwxr-xr-x  22 fangjun  staff   704B Sep 10 14:18 test_wavs
-rw-r--r--   1 fangjun  staff    83K Sep 10 14:18 tokens.txt
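
As a quick sanity check, the listing above can be verified from a script before running any of the decoding commands. The following is a minimal sketch, assuming the directory layout shown above; `check_model_dir` is a hypothetical helper name and is not part of sherpa-onnx.

```shell
# Hypothetical helper (not part of sherpa-onnx): confirm that the extracted
# directory contains the files the commands in this section rely on.
# Returns 0 if everything is present, 1 otherwise.
check_model_dir() {
  dir=$1
  status=0
  # The two regular files used by the decoding commands.
  for f in model.int8.onnx tokens.txt; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      status=1
    fi
  done
  # The directory holding the bundled test recordings.
  if [ ! -d "$dir/test_wavs" ]; then
    echo "missing: $dir/test_wavs" >&2
    status=1
  fi
  return "$status"
}

# Example usage:
#   check_model_dir ./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10
```

The helper only checks that the paths exist; it does not validate the contents of the model file.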

Real-time/streaming speech recognition from a microphone with VAD

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx

./build/bin/sherpa-onnx-vad-microphone-simulated-streaming-asr \
  --silero-vad-model=./silero_vad.onnx \
  --tokens=./{{model_path}}/tokens.txt \
  --wenet-ctc-model=./{{model_path}}/model.int8.onnx \
  --num-threads=1
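
The `wget` step above fetches `silero_vad.onnx` every time it is run. If that is inconvenient, a small wrapper can skip the download when the file is already present. This is only a sketch: `ensure_vad_model` is a hypothetical helper name, not part of sherpa-onnx, and it assumes `wget` is available on the `PATH`.

```shell
# Hypothetical helper (not part of sherpa-onnx): download silero_vad.onnx
# only if it is not already present at the given path.
ensure_vad_model() {
  path=${1:-./silero_vad.onnx}
  url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
  if [ -f "$path" ]; then
    return 0   # already downloaded, nothing to do
  fi
  # Download to a temporary name first so that an interrupted transfer
  # does not leave a partial file that later passes the -f check.
  wget -O "$path.tmp" "$url" && mv "$path.tmp" "$path"
}

# Example usage:
#   ensure_vad_model ./silero_vad.onnx
```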

Decode wave files

{% for wav in wav_files %}
{{ wav.filename }}
{{ '"' * wav.filename|length }}

Wave filename	Content	Ground truth
{{ wav.filename }}		{{ wav.ground_truth }}

./build/bin/sherpa-onnx-offline \
  --tokens=./{{model_path}}/tokens.txt \
  --wenet-ctc-model=./{{model_path}}/model.int8.onnx \
  --num-threads=1 \
  ./{{model_path}}/test_wavs/{{ wav.filename }}

{% endfor %}
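
The per-file commands above can also be run in a loop over the whole `test_wavs` directory. The following is a minimal sketch that reuses only the flags shown above. `decode_all` is a hypothetical helper name, and `SHERPA_ONNX_OFFLINE` is an assumed shell variable for overriding the binary path; sherpa-onnx itself does not read it.

```shell
# Hypothetical helper (not part of sherpa-onnx): decode every .wav file in a
# directory, one file per invocation, using the same flags as above.
# SHERPA_ONNX_OFFLINE is an assumed override for the binary location.
decode_all() {
  model_dir=$1
  wav_dir=$2
  bin=${SHERPA_ONNX_OFFLINE:-./build/bin/sherpa-onnx-offline}
  for wav in "$wav_dir"/*.wav; do
    [ -e "$wav" ] || continue   # the glob matched nothing
    "$bin" \
      --tokens="$model_dir/tokens.txt" \
      --wenet-ctc-model="$model_dir/model.int8.onnx" \
      --num-threads=1 \
      "$wav"
  done
}

# Example usage:
#   d=./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10
#   decode_all "$d" "$d/test_wavs"
```

Running one process per file reloads the model each time, which keeps the sketch simple at the cost of some startup overhead.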