If you have pre-downloaded the LibriSpeech
dataset and the musan dataset, say, to
/tmp/LibriSpeech and /tmp/musan, you can modify
the dl_dir variable in ./prepare.sh to point to /tmp so that
./prepare.sh won’t re-download them.
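For example, the change in ./prepare.sh looks like this (a sketch; the variable already exists near the top of the script, and its default value may differ):

# in ./prepare.sh
dl_dir=/tmp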
Note
All files generated by ./prepare.sh, e.g., features, the lexicon, etc.,
are saved in the ./data directory.
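Once it finishes, you can inspect the outputs with something like the following (the subdirectory names in the comment are an assumption and may vary with the script version):

$ ls data
# e.g., fbank  lang_bpe_500  lm  manifests  musan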
We provide the following YouTube video showing how to run ./prepare.sh.
Note
To get the latest news about next-gen Kaldi, please subscribe to
the following YouTube channel by Nadira Povey:
Running ./conformer_ctc/train.py --help
shows you the training options that can be passed from the commandline.
The following options are used quite often:
--full-libri
If it’s True, the training part uses all the training data, i.e.,
960 hours. Otherwise, the training part uses only the subset
train-clean-100, which has 100 hours of training data.
Caution
The training set is perturbed by speed with two factors: 0.9 and 1.1.
If --full-libri is True, each epoch actually processes
3x960 == 2880 hours of data, i.e., the original 960 hours plus the
two speed-perturbed copies.
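For instance, to train only on the train-clean-100 subset, you can pass something like the following (a sketch; the option takes a boolean value):

$ ./conformer_ctc/train.py --full-libri false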
--num-epochs
It is the number of epochs to train. For instance,
./conformer_ctc/train.py --num-epochs 30 trains for 30 epochs
and generates epoch-0.pt, epoch-1.pt, …, epoch-29.pt
in the folder ./conformer_ctc/exp.
--start-epoch
It’s used to resume training.
./conformer_ctc/train.py --start-epoch 10 loads the
checkpoint ./conformer_ctc/exp/epoch-9.pt and starts
training from epoch 10, based on the state from epoch 9.
--world-size
It is used for multi-GPU single-machine DDP training.
If it is 1, then no DDP training is used.
If it is 2, then GPU 0 and GPU 1 are used for DDP training.
The following shows some use cases with it.
Use case 1: You have 4 GPUs, but you only want to use GPU 0 and
GPU 2 for training. You can do the following:
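A minimal sketch (it relies on the standard CUDA_VISIBLE_DEVICES environment variable rather than a dedicated icefall option; the two selected GPUs are renumbered 0 and 1 inside the process):

$ cd egs/librispeech/ASR
$ export CUDA_VISIBLE_DEVICES="0,2"
$ ./conformer_ctc/train.py --world-size 2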
Only multi-GPU single-machine DDP training is implemented at present.
Multi-GPU multi-machine DDP training will be added later.
--max-duration
It specifies the total duration, in seconds, of all utterances in a
batch, before padding.
If you encounter CUDA OOM, please reduce it. For instance, if
you are using an NVIDIA V100 GPU, we recommend setting it to 200.
Hint
Due to padding, the number of seconds of all utterances in a
batch will usually be larger than --max-duration.
A larger value for --max-duration may cause OOM during training,
while a smaller value may increase the training time. You have to
tune it.
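For example, to reduce it from the commandline (a sketch):

$ ./conformer_ctc/train.py --max-duration 200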
There are some training options, e.g., weight decay,
number of warmup steps, results dir, etc.,
that are not passed from the commandline.
They are pre-configured by the function get_params() in
conformer_ctc/train.py
You don’t need to change these pre-configured parameters. If you really need to change
them, please modify ./conformer_ctc/train.py directly.
Training logs and checkpoints are saved in conformer_ctc/exp.
You will find the following files in that directory:
epoch-0.pt, epoch-1.pt, …
These are checkpoint files, containing model state_dict and optimizer state_dict.
To resume training from some checkpoint, say epoch-10.pt, you can use:
$ ./conformer_ctc/train.py --start-epoch 11
tensorboard/
This folder contains TensorBoard logs. Training loss, validation loss, learning
rate, etc., are recorded in these logs. You can visualize them by:
$ cd conformer_ctc/exp/tensorboard
$ tensorboard dev upload --logdir . --description "Conformer CTC training for LibriSpeech with icefall"
It will print something like the following:
TensorFlow installation not found - running with reduced feature set.
Upload started and will continue reading any new data as it's added to the logdir.

To stop uploading, press Ctrl-C.

New experiment created. View your TensorBoard at: https://tensorboard.dev/experiment/lzGnETjwRxC3yghNMd4kPw/

[2021-08-24T16:42:43] Started scanning logdir.
Uploading 4540 scalars...
Note there is a URL in the above output. Click it and you will see
the TensorBoard page with the uploaded training logs.
--method
This specifies the decoding method. This script supports 7 decoding methods.
For ctc-decoding, it uses a sentencepiece model to convert word pieces to words,
and it needs neither a lexicon nor an n-gram LM.
For example, the following command uses CTC topology for decoding:
$ cd egs/librispeech/ASR
$ ./conformer_ctc/decode.py --method ctc-decoding --max-duration 300
# Caution: The above command is tested with a model with vocab size 500.
And the following command uses attention decoder for rescoring:
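(The invocation below is a sketch: the method name and option values are taken from the options documented on this page; a smaller --max-duration is used because attention decoder rescoring needs more memory.)

$ cd egs/librispeech/ASR
$ ./conformer_ctc/decode.py --method attention-decoder --max-duration 30 --nbest-scale 0.5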
--nbest-scale
It is used to scale down lattice scores so that there are more unique
paths for rescoring.
--max-duration
It has the same meaning as the one during training. A larger
value may cause OOM.
Here are some results for CTC decoding with a vocab size of 500:
Usage:
$ cd egs/librispeech/ASR

# NOTE: Tested with a model with vocab size 500.
# It won't work for a model with vocab size 5000.

$ ./conformer_ctc/decode.py \
    --epoch 25 \
    --avg 1 \
    --max-duration 300 \
    --exp-dir conformer_ctc/exp \
    --lang-dir data/lang_bpe_500 \
    --method ctc-decoding
It uses a modified CTC topology while building HLG.
data/lang_bpe_500/bpe.model
It is a sentencepiece model. You can use it to reproduce our results.
data/lang_bpe_500/tokens.txt
It contains tokens and their IDs, generated from bpe.model.
It is provided only for convenience so that you can look up the SOS/EOS ID easily.
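For instance, to find the SOS/EOS ID (a sketch; it assumes the token is named <sos/eos> in tokens.txt):

$ grep "<sos/eos>" data/lang_bpe_500/tokens.txt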
data/lang_bpe_500/words.txt
It contains words and their IDs.
data/lm/G_4_gram.pt
It is a 4-gram LM, used for n-gram LM rescoring.
exp/pretrained.pt
It contains pre-trained model parameters, obtained by averaging
checkpoints from epoch-23.pt to epoch-77.pt.
Note: We have removed optimizer state_dict to reduce file size.
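In terms of the --epoch and --avg options of ./conformer_ctc/decode.py, this corresponds to something like the following (a sketch; it assumes --avg averages the last N checkpoints ending at --epoch, so N = 77 - 23 + 1 = 55):

$ ./conformer_ctc/decode.py --epoch 77 --avg 55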
exp/cpu_jit.pt
It contains a TorchScript model that can be deployed in C++.
test_wavs/*.wav
It contains some test sound files from LibriSpeech test-clean dataset.
test_wavs/trans.txt
It contains the reference transcripts for the sound files in test_wavs/.
Information about the test sound files is listed below:
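One way to list this information yourself (a sketch; it assumes sox is installed, which provides the soxi tool):

$ soxi ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/*.wav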
The output of ctc-decoding is given below:

2021-11-10 12:12:29,554 INFO [pretrained.py:260] {'sample_rate': 16000, 'subsampling_factor': 4, 'vgg_frontend': False, 'use_feat_batchnorm': True, 'feature_dim': 80, 'nhead': 8, 'attention_dim': 512, 'num_decoder_layers': 0, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'checkpoint': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/pretrained.pt', 'words_file': None, 'HLG': None, 'bpe_model': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/bpe.model', 'method': 'ctc-decoding', 'G': None, 'num_paths': 100, 'ngram_lm_scale': 1.3, 'attention_decoder_scale': 1.2, 'nbest_scale': 0.5, 'sos_id': 1, 'num_classes': 500, 'eos_id': 1, 'sound_files': ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav'], 'env_info': {'k2-version': '1.9', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '7178d67e594bc7fa89c2b331ad7bd1c62a6a9eb4', 'k2-git-date': 'Tue Oct 26 22:12:54 2021', 'lhotse-version': '0.11.0.dev+missing.version.file', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'bpe-500', 'icefall-git-sha1': '8d93169-dirty', 'icefall-git-date': 'Wed Nov 10 11:52:44 2021', 'icefall-path': '/ceph-fj/fangjun/open-source-2/icefall-fix', 'k2-path': '/ceph-fj/fangjun/open-source-2/k2-bpe-500/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-fj/fangjun/open-source-2/lhotse-bpe-500/lhotse/__init__.py'}}
2021-11-10 12:12:29,554 INFO [pretrained.py:266] device: cuda:0
2021-11-10 12:12:29,554 INFO [pretrained.py:268] Creating model
2021-11-10 12:12:35,600 INFO [pretrained.py:285] Constructing Fbank computer
2021-11-10 12:12:35,601 INFO [pretrained.py:295] Reading sound files: ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav']
2021-11-10 12:12:35,758 INFO [pretrained.py:301] Decoding started
2021-11-10 12:12:36,025 INFO [pretrained.py:319] Use CTC decoding
2021-11-10 12:12:36,204 INFO [pretrained.py:425]
./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav:
AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROFFELS

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav:
GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONORED BOSOM TO CONNECT HER PARENT FOREVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav:
YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION

2021-11-10 12:12:36,204 INFO [pretrained.py:427] Decoding Done
The output of HLG decoding (1best) is given below:

2021-11-10 13:33:03,723 INFO [pretrained.py:260] {'sample_rate': 16000, 'subsampling_factor': 4, 'vgg_frontend': False, 'use_feat_batchnorm': True, 'feature_dim': 80, 'nhead': 8, 'attention_dim': 512, 'num_decoder_layers': 0, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'checkpoint': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/pretrained.pt', 'words_file': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/words.txt', 'HLG': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt', 'bpe_model': None, 'method': '1best', 'G': None, 'num_paths': 100, 'ngram_lm_scale': 1.3, 'attention_decoder_scale': 1.2, 'nbest_scale': 0.5, 'sos_id': 1, 'num_classes': 500, 'eos_id': 1, 'sound_files': ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav'], 'env_info': {'k2-version': '1.9', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '7178d67e594bc7fa89c2b331ad7bd1c62a6a9eb4', 'k2-git-date': 'Tue Oct 26 22:12:54 2021', 'lhotse-version': '0.11.0.dev+missing.version.file', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'bpe-500', 'icefall-git-sha1': '8d93169-dirty', 'icefall-git-date': 'Wed Nov 10 11:52:44 2021', 'icefall-path': '/ceph-fj/fangjun/open-source-2/icefall-fix', 'k2-path': '/ceph-fj/fangjun/open-source-2/k2-bpe-500/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-fj/fangjun/open-source-2/lhotse-bpe-500/lhotse/__init__.py'}}
2021-11-10 13:33:03,723 INFO [pretrained.py:266] device: cuda:0
2021-11-10 13:33:03,723 INFO [pretrained.py:268] Creating model
2021-11-10 13:33:09,775 INFO [pretrained.py:285] Constructing Fbank computer
2021-11-10 13:33:09,776 INFO [pretrained.py:295] Reading sound files: ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav']
2021-11-10 13:33:09,881 INFO [pretrained.py:301] Decoding started
2021-11-10 13:33:09,951 INFO [pretrained.py:352] Loading HLG from ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt
2021-11-10 13:33:13,234 INFO [pretrained.py:384] Use HLG decoding
2021-11-10 13:33:13,571 INFO [pretrained.py:425]
./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav:
AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROTHELS

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav:
GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONORED BOSOM TO CONNECT HER PARENT FOREVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav:
YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION

2021-11-10 13:33:13,571 INFO [pretrained.py:427] Decoding Done
The output of HLG decoding + whole-lattice LM rescoring is given below:

2021-11-10 13:39:55,857 INFO [pretrained.py:260] {'sample_rate': 16000, 'subsampling_factor': 4, 'vgg_frontend': False, 'use_feat_batchnorm': True, 'feature_dim': 80, 'nhead': 8, 'attention_dim': 512, 'num_decoder_layers': 0, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'checkpoint': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/pretrained.pt', 'words_file': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/words.txt', 'HLG': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt', 'bpe_model': None, 'method': 'whole-lattice-rescoring', 'G': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lm/G_4_gram.pt', 'num_paths': 100, 'ngram_lm_scale': 1.0, 'attention_decoder_scale': 1.2, 'nbest_scale': 0.5, 'sos_id': 1, 'num_classes': 500, 'eos_id': 1, 'sound_files': ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav'], 'env_info': {'k2-version': '1.9', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '7178d67e594bc7fa89c2b331ad7bd1c62a6a9eb4', 'k2-git-date': 'Tue Oct 26 22:12:54 2021', 'lhotse-version': '0.11.0.dev+missing.version.file', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'bpe-500', 'icefall-git-sha1': '8d93169-dirty', 'icefall-git-date': 'Wed Nov 10 11:52:44 2021', 'icefall-path': '/ceph-fj/fangjun/open-source-2/icefall-fix', 'k2-path': '/ceph-fj/fangjun/open-source-2/k2-bpe-500/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-fj/fangjun/open-source-2/lhotse-bpe-500/lhotse/__init__.py'}}
2021-11-10 13:39:55,858 INFO [pretrained.py:266] device: cuda:0
2021-11-10 13:39:55,858 INFO [pretrained.py:268] Creating model
2021-11-10 13:40:01,979 INFO [pretrained.py:285] Constructing Fbank computer
2021-11-10 13:40:01,980 INFO [pretrained.py:295] Reading sound files: ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav']
2021-11-10 13:40:02,055 INFO [pretrained.py:301] Decoding started
2021-11-10 13:40:02,117 INFO [pretrained.py:352] Loading HLG from ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt
2021-11-10 13:40:05,051 INFO [pretrained.py:363] Loading G from ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lm/G_4_gram.pt
2021-11-10 13:40:18,959 INFO [pretrained.py:389] Use HLG decoding + LM rescoring
2021-11-10 13:40:19,546 INFO [pretrained.py:425]
./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav:
AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROTHELS

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav:
GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONORED BOSOM TO CONNECT HER PARENT FOREVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav:
YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION

2021-11-10 13:40:19,546 INFO [pretrained.py:427] Decoding Done
It uses an n-gram LM to rescore the decoding lattice, extracts
n paths from the rescored lattice, and rescores the extracted paths with
an attention decoder. The path with the highest score is the decoding result.
The command to run HLG decoding + LM rescoring + attention decoder rescoring is:
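The following invocation is a sketch reconstructed from the parameters in the log below (the flag names are the kebab-case forms of the logged keys; verify them with ./conformer_ctc/pretrained.py --help):

$ cd egs/librispeech/ASR
$ ./conformer_ctc/pretrained.py \
    --checkpoint ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/pretrained.pt \
    --words-file ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/words.txt \
    --HLG ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt \
    --method attention-decoder \
    --G ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lm/G_4_gram.pt \
    --ngram-lm-scale 2.0 \
    --attention-decoder-scale 2.0 \
    --nbest-scale 0.5 \
    --num-paths 100 \
    --sos-id 1 \
    --eos-id 1 \
    ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav \
    ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav \
    ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav

Its output is given below: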
2021-11-10 13:43:45,598 INFO [pretrained.py:260] {'sample_rate': 16000, 'subsampling_factor': 4, 'vgg_frontend': False, 'use_feat_batchnorm': True, 'feature_dim': 80, 'nhead': 8, 'attention_dim': 512, 'num_decoder_layers': 6, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'checkpoint': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/pretrained.pt', 'words_file': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/words.txt', 'HLG': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt', 'bpe_model': None, 'method': 'attention-decoder', 'G': './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lm/G_4_gram.pt', 'num_paths': 100, 'ngram_lm_scale': 2.0, 'attention_decoder_scale': 2.0, 'nbest_scale': 0.5, 'sos_id': 1, 'num_classes': 500, 'eos_id': 1, 'sound_files': ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav'], 'env_info': {'k2-version': '1.9', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '7178d67e594bc7fa89c2b331ad7bd1c62a6a9eb4', 'k2-git-date': 'Tue Oct 26 22:12:54 2021', 'lhotse-version': '0.11.0.dev+missing.version.file', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'bpe-500', 'icefall-git-sha1': '8d93169-dirty', 'icefall-git-date': 'Wed Nov 10 11:52:44 2021', 'icefall-path': '/ceph-fj/fangjun/open-source-2/icefall-fix', 'k2-path': '/ceph-fj/fangjun/open-source-2/k2-bpe-500/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-fj/fangjun/open-source-2/lhotse-bpe-500/lhotse/__init__.py'}}
2021-11-10 13:43:45,599 INFO [pretrained.py:266] device: cuda:0
2021-11-10 13:43:45,599 INFO [pretrained.py:268] Creating model
2021-11-10 13:43:51,833 INFO [pretrained.py:285] Constructing Fbank computer
2021-11-10 13:43:51,834 INFO [pretrained.py:295] Reading sound files: ['./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav', './icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav']
2021-11-10 13:43:51,915 INFO [pretrained.py:301] Decoding started
2021-11-10 13:43:52,076 INFO [pretrained.py:352] Loading HLG from ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt
2021-11-10 13:43:55,110 INFO [pretrained.py:363] Loading G from ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lm/G_4_gram.pt
2021-11-10 13:44:09,329 INFO [pretrained.py:397] Use HLG + LM rescoring + attention decoder rescoring
2021-11-10 13:44:10,192 INFO [pretrained.py:425]
./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav:
AFTER EARLY NIGHTFALL THE YELLOW LAMPS WOULD LIGHT UP HERE AND THERE THE SQUALID QUARTER OF THE BROTHELS

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav:
GOD AS A DIRECT CONSEQUENCE OF THE SIN WHICH MAN THUS PUNISHED HAD GIVEN HER A LOVELY CHILD WHOSE PLACE WAS ON THAT SAME DISHONORED BOSOM TO CONNECT HER PARENT FOREVER WITH THE RACE AND DESCENT OF MORTALS AND TO BE FINALLY A BLESSED SOUL IN HEAVEN

./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav:
YET THESE THOUGHTS AFFECTED HESTER PRYNNE LESS WITH HOPE THAN APPREHENSION

2021-11-10 13:44:10,192 INFO [pretrained.py:427] Decoding Done
We do provide a Colab notebook for this recipe showing how to use a pre-trained model.
Hint
Due to limited memory provided by Colab, you have to upgrade to Colab Pro to
run HLG decoding + LM rescoring and
HLG decoding + LM rescoring + attention decoder rescoring.
Otherwise, you can only run HLG decoding with Colab.
Congratulations! You have finished the LibriSpeech ASR recipe with
conformer CTC models in icefall.
If you want to deploy your trained model in C++, please read the following section.
$ mkdir build-release
$ cd build-release
$ cmake -DCMAKE_BUILD_TYPE=Release ..
$ make -j ctc_decode hlg_decode ngram_lm_rescore attention_rescore

# You will find four binaries in `./bin`, i.e.,
# ./bin/ctc_decode, ./bin/hlg_decode,
# ./bin/ngram_lm_rescore, and ./bin/attention_rescore