Pre-trained Models

This page describes how to download pre-trained Dolphin CTC models and use them with sherpa-onnx to decode wave files.

sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02

This model is converted from https://huggingface.co/DataoceanAI/dolphin-base

In the following, we describe how to download it.

Download

Please use the following commands to download it:

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2
tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2
rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2

After downloading, you should find the following files:

ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02

total 100M
-rw-r--r-- 1 501 staff  99M Apr  2 10:19 model.int8.onnx
-rw-r--r-- 1 501 staff  141 Apr  2 10:19 README.md
drwxr-xr-x 2 501 staff 4.0K Apr  2 10:19 test_wavs
-rw-r--r-- 1 501 staff 493K Apr  2 10:19 tokens.txt

Decode a file

Please use the following command to decode a wave file:

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt \
  --dolphin-model=./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx \
  --num-threads=1 \
  ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav

You should see the following output:

/project/sherpa-onnx/csrc/parse-options.cc:Read:375 sherpa-onnx-offline --tokens=./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt --dolphin-model=./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx --num-threads=1 ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav

OfflineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80, low_freq=20, high_freq=-400, dither=0, normalize_samples=True, snip_edges=False), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="", decoder="", language="", task="transcribe", tail_paddings=-1), fire_red_asr=OfflineFireRedAsrModelConfig(encoder="", decoder=""), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), wenet_ctc=OfflineWenetCtcModelConfig(model=""), sense_voice=OfflineSenseVoiceModelConfig(model="", language="auto", use_itn=False), moonshine=OfflineMoonshineModelConfig(preprocessor="", encoder="", uncached_decoder="", cached_decoder=""), dolphin=OfflineDolphinModelConfig(model="./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx"), telespeech_ctc="", tokens="./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt", num_threads=1, debug=False, provider="cpu", model_type="", modeling_unit="cjkchar", bpe_vocab=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5, blank_penalty=0, rule_fsts="", rule_fars="")
Creating recognizer ...
Started
Done!

./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav
{"lang": "", "emotion": "", "event": "", "text": " 对我做了介绍哈那么我想说的是呢大家如果对我的研究感兴趣呢。", "timestamps": [0.04, 0.28, 0.60, 0.84, 1.32, 1.76, 2.20, 2.36, 2.72, 3.24, 3.48, 3.72, 4.12, 4.40, 4.76, 5.52], "tokens":[" ", "对我", "做了", "介绍", "哈", "那么", "我想", "说的是", "呢", "大家", "如果", "对我的", "研究", "感兴趣", "呢", "。"], "words": []}
----
num threads: 1
decoding method: greedy_search
Elapsed seconds: 0.527 s
Real time factor (RTF): 0.527 / 5.611 = 0.094
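The recognition result is printed as a single JSON line, so it is easy to consume from a script. The following sketch parses a result line shaped like the one above and pairs each token with its start time; the abridged sample line below is taken from the output shown, and the variable names are purely illustrative:

```python
import json

# One result line as printed by sherpa-onnx-offline (abridged from the output above).
result_line = (
    '{"lang": "", "emotion": "", "event": "", '
    '"text": " 对我做了介绍哈", '
    '"timestamps": [0.04, 0.28, 0.60, 0.84, 1.32], '
    '"tokens": [" ", "对我", "做了", "介绍", "哈"], "words": []}'
)

result = json.loads(result_line)

# Pair each token with the time (in seconds) at which it starts.
aligned = list(zip(result["tokens"], result["timestamps"]))

print(result["text"].strip())
for token, start in aligned:
    print(f"{start:5.2f}s  {token}")
```

The same approach works for any of the models on this page, since they all emit the same JSON fields (text, tokens, timestamps).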

sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02

This model is converted from https://huggingface.co/DataoceanAI/dolphin-base

In the following, we describe how to download it.

Download

Please use the following commands to download it:

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02.tar.bz2
tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02.tar.bz2
rm sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02.tar.bz2

After downloading, you should find the following files:

ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02

total 303M
-rw-r--r-- 1 501 staff 303M Apr  2 10:19 model.onnx
-rw-r--r-- 1 501 staff  142 Apr  2 10:19 README.md
drwxr-xr-x 2 501 staff 4.0K Apr  2 10:19 test_wavs
-rw-r--r-- 1 501 staff 493K Apr  2 10:19 tokens.txt

Decode a file

Please use the following command to decode a wave file:

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/tokens.txt \
  --dolphin-model=./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/model.onnx \
  --num-threads=1 \
  ./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/test_wavs/0.wav

You should see the following output:

/project/sherpa-onnx/csrc/parse-options.cc:Read:375 sherpa-onnx-offline --tokens=./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/tokens.txt --dolphin-model=./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/model.onnx --num-threads=1 ./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/test_wavs/0.wav

OfflineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80, low_freq=20, high_freq=-400, dither=0, normalize_samples=True, snip_edges=False), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="", decoder="", language="", task="transcribe", tail_paddings=-1), fire_red_asr=OfflineFireRedAsrModelConfig(encoder="", decoder=""), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), wenet_ctc=OfflineWenetCtcModelConfig(model=""), sense_voice=OfflineSenseVoiceModelConfig(model="", language="auto", use_itn=False), moonshine=OfflineMoonshineModelConfig(preprocessor="", encoder="", uncached_decoder="", cached_decoder=""), dolphin=OfflineDolphinModelConfig(model="./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/model.onnx"), telespeech_ctc="", tokens="./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/tokens.txt", num_threads=1, debug=False, provider="cpu", model_type="", modeling_unit="cjkchar", bpe_vocab=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5, blank_penalty=0, rule_fsts="", rule_fars="")
Creating recognizer ...
Started
Done!

./sherpa-onnx-dolphin-base-ctc-multi-lang-2025-04-02/test_wavs/0.wav
{"lang": "", "emotion": "", "event": "", "text": " 对我做了介绍啊那么我想说的是呢大家如果对我的研究感兴趣呢", "timestamps": [0.04, 0.28, 0.60, 0.84, 1.32, 1.76, 2.20, 2.36, 2.72, 3.24, 3.48, 3.72, 4.12, 4.40, 4.76], "tokens":[" ", "对我", "做了", "介绍", "啊", "那么", "我想", "说的是", "呢", "大家", "如果", "对我的", "研究", "感兴趣", "呢"], "words": []}
----
num threads: 1
decoding method: greedy_search
Elapsed seconds: 1.047 s
Real time factor (RTF): 1.047 / 5.611 = 0.187
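Comparing the two base-model runs above: the int8 (quantized) model is about a third of the size (99M vs. 303M) and roughly twice as fast (0.527 s vs. 1.047 s on the same 5.611 s test wave). A quick sanity check of the reported real-time factors, using only the numbers printed above:

```python
# Numbers reported in the two decoding runs above (base model, same 5.611 s test wave).
int8_elapsed = 0.527     # seconds, model.int8.onnx
float32_elapsed = 1.047  # seconds, model.onnx
wave_duration = 5.611    # seconds

int8_rtf = int8_elapsed / wave_duration
float32_rtf = float32_elapsed / wave_duration
speedup = float32_elapsed / int8_elapsed

print(f"int8 RTF:     {int8_rtf:.3f}")     # matches the 0.094 reported above
print(f"float32 RTF:  {float32_rtf:.3f}")  # matches the 0.187 reported above
print(f"int8 speedup: {speedup:.2f}x")
```

Timings of course depend on your hardware; the point is the relative gap between the quantized and float32 variants, not the absolute numbers.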

sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02

This model is converted from https://huggingface.co/DataoceanAI/dolphin-small

In the following, we describe how to download it.

Download

Please use the following commands to download it:

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02.tar.bz2
tar xvf sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02.tar.bz2
rm sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02.tar.bz2

After downloading, you should find the following files:

ls -lh sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02

total 239M
-rw-r--r-- 1 501 staff 239M Apr  2 10:20 model.int8.onnx
-rw-r--r-- 1 501 staff  141 Apr  2 10:19 README.md
drwxr-xr-x 2 501 staff 4.0K Apr  2 10:19 test_wavs
-rw-r--r-- 1 501 staff 493K Apr  2 10:19 tokens.txt

Decode a file

Please use the following command to decode a wave file:

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/tokens.txt \
  --dolphin-model=./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/model.int8.onnx \
  --num-threads=1 \
  ./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav

You should see the following output:

/project/sherpa-onnx/csrc/parse-options.cc:Read:375 sherpa-onnx-offline --tokens=./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/tokens.txt --dolphin-model=./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/model.int8.onnx --num-threads=1 ./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav

OfflineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80, low_freq=20, high_freq=-400, dither=0, normalize_samples=True, snip_edges=False), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="", decoder="", language="", task="transcribe", tail_paddings=-1), fire_red_asr=OfflineFireRedAsrModelConfig(encoder="", decoder=""), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), wenet_ctc=OfflineWenetCtcModelConfig(model=""), sense_voice=OfflineSenseVoiceModelConfig(model="", language="auto", use_itn=False), moonshine=OfflineMoonshineModelConfig(preprocessor="", encoder="", uncached_decoder="", cached_decoder=""), dolphin=OfflineDolphinModelConfig(model="./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/model.int8.onnx"), telespeech_ctc="", tokens="./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/tokens.txt", num_threads=1, debug=False, provider="cpu", model_type="", modeling_unit="cjkchar", bpe_vocab=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5, blank_penalty=0, rule_fsts="", rule_fars="")
Creating recognizer ...
Started
Done!

./sherpa-onnx-dolphin-small-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav
{"lang": "", "emotion": "", "event": "", "text": " 对我做了介绍啊,那么我想说的是呢大家如果对我的研究感兴趣呢。", "timestamps": [0.00, 0.32, 0.60, 0.84, 1.32, 1.68, 1.80, 2.20, 2.36, 2.72, 3.24, 3.48, 3.72, 4.12, 4.40, 4.76, 5.52], "tokens":[" ", "对我", "做了", "介绍", "啊", ",", "那么", "我想", "说的是", "呢", "大家", "如果", "对我的", "研究", "感兴趣", "呢", "。"], "words": []}
----
num threads: 1
decoding method: greedy_search
Elapsed seconds: 1.187 s
Real time factor (RTF): 1.187 / 5.611 = 0.212

sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02

This model is converted from https://huggingface.co/DataoceanAI/dolphin-small

In the following, we describe how to download it.

Download

Please use the following commands to download it:

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02.tar.bz2
tar xvf sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02.tar.bz2
rm sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02.tar.bz2

After downloading, you should find the following files:

ls -lh sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02

total 784M
-rw-r--r-- 1 501 staff 783M Apr  2 10:20 model.onnx
-rw-r--r-- 1 501 staff  141 Apr  2 10:20 README.md
drwxr-xr-x 2 501 staff 4.0K Apr  2 10:20 test_wavs
-rw-r--r-- 1 501 staff 493K Apr  2 10:20 tokens.txt

Decode a file

Please use the following command to decode a wave file:

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/tokens.txt \
  --dolphin-model=./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/model.onnx \
  --num-threads=1 \
  ./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/test_wavs/0.wav

You should see the following output:

/project/sherpa-onnx/csrc/parse-options.cc:Read:375 sherpa-onnx-offline --tokens=./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/tokens.txt --dolphin-model=./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/model.onnx --num-threads=1 ./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/test_wavs/0.wav

OfflineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80, low_freq=20, high_freq=-400, dither=0, normalize_samples=True, snip_edges=False), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="", decoder="", language="", task="transcribe", tail_paddings=-1), fire_red_asr=OfflineFireRedAsrModelConfig(encoder="", decoder=""), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), wenet_ctc=OfflineWenetCtcModelConfig(model=""), sense_voice=OfflineSenseVoiceModelConfig(model="", language="auto", use_itn=False), moonshine=OfflineMoonshineModelConfig(preprocessor="", encoder="", uncached_decoder="", cached_decoder=""), dolphin=OfflineDolphinModelConfig(model="./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/model.onnx"), telespeech_ctc="", tokens="./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/tokens.txt", num_threads=1, debug=False, provider="cpu", model_type="", modeling_unit="cjkchar", bpe_vocab=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5, blank_penalty=0, rule_fsts="", rule_fars="")
Creating recognizer ...
Started
Done!

./sherpa-onnx-dolphin-small-ctc-multi-lang-2025-04-02/test_wavs/0.wav
{"lang": "", "emotion": "", "event": "", "text": " 对我做了介绍啊,那么我想说的是呢,大家如果对我的研究感兴趣呢。", "timestamps": [0.00, 0.32, 0.60, 0.84, 1.32, 1.68, 1.80, 2.20, 2.36, 2.72, 3.08, 3.24, 3.48, 3.72, 4.12, 4.40, 4.76, 5.52], "tokens":[" ", "对我", "做了", "介绍", "啊", ",", "那么", "我想", "说的是", "呢", ",", "大家", "如果", "对我的", "研究", "感兴趣", "呢", "。"], "words": []}
----
num threads: 1
decoding method: greedy_search
Elapsed seconds: 1.436 s
Real time factor (RTF): 1.436 / 5.611 = 0.256
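To help choose among the four models on this page, the sketch below tabulates the file sizes and real-time factors reported in the runs above (all measured on the same 5.611 s test wave with a single thread; your absolute numbers will differ with hardware):

```python
# Model sizes and elapsed decoding times as reported in the runs above.
wave_duration = 5.611  # seconds, test_wavs/0.wav
models = {
    "base-int8":  {"size_mb": 99,  "elapsed_s": 0.527},
    "base":       {"size_mb": 303, "elapsed_s": 1.047},
    "small-int8": {"size_mb": 239, "elapsed_s": 1.187},
    "small":      {"size_mb": 783, "elapsed_s": 1.436},
}

print(f"{'model':<12} {'size':>6} {'RTF':>6}")
for name, m in models.items():
    rtf = m["elapsed_s"] / wave_duration  # real-time factor = elapsed / audio duration
    print(f"{name:<12} {m['size_mb']:>5}M {rtf:>6.3f}")
```

In short: the int8 variants trade a little accuracy headroom for a much smaller download and a lower RTF, while the small models are larger and slower than base but come from a bigger checkpoint.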