How to create a recipe

Hint

Please read "Follow the code style" and adjust your code accordingly.

Caution

icefall is designed to be as Pythonic as possible. Please use Python in your recipe if possible.

Data Preparation

We recommend preparing your training, validation, and test datasets with lhotse.

Please refer to https://lhotse.readthedocs.io/en/latest/index.html for how to create a recipe in lhotse.

Hint

The yesno recipe in lhotse is a very good example.

Please refer to https://github.com/lhotse-speech/lhotse/pull/380, which shows how to add a new recipe to lhotse.

Suppose you would like to add a recipe for a dataset named foo. You can do the following:

$ cd egs
$ mkdir -p foo/ASR
$ cd foo/ASR
$ touch prepare.sh
$ chmod +x prepare.sh

If your dataset is very simple, please follow egs/yesno/ASR/prepare.sh to write your own prepare.sh. Otherwise, please refer to egs/librispeech/ASR/prepare.sh to prepare your data.
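
Since icefall prefers Python, the directory setup above can also be done from a script. The following is only a minimal sketch using the standard library; the function name create_recipe_skeleton is hypothetical and not part of icefall or lhotse.

```python
import stat
import tempfile
from pathlib import Path


def create_recipe_skeleton(egs_dir: Path, dataset: str) -> Path:
    """Create <egs_dir>/<dataset>/ASR/prepare.sh and mark it executable.

    Hypothetical helper that mirrors the mkdir, touch, and chmod
    commands shown above.
    """
    recipe_dir = egs_dir / dataset / "ASR"
    recipe_dir.mkdir(parents=True, exist_ok=True)  # like `mkdir -p`

    prepare_sh = recipe_dir / "prepare.sh"
    prepare_sh.touch(exist_ok=True)  # like `touch`

    # Roughly `chmod +x`: add the execute bits for user, group, and others
    # (unlike the shell command, this does not consult the umask).
    mode = prepare_sh.stat().st_mode
    prepare_sh.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return prepare_sh


if __name__ == "__main__":
    # Work in a temporary directory so the sketch never touches a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        script = create_recipe_skeleton(Path(tmp) / "egs", "foo")
        print(script.relative_to(tmp))
```

In a real checkout you would pass the repository's egs directory instead of a temporary one.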

Training

Assume you have a fancy model called bar for the foo recipe. You can organize your files in the following way:

$ cd egs/foo/ASR
$ mkdir bar
$ cd bar
$ touch README.md model.py train.py decode.py asr_datamodule.py pretrained.py

For instance, the yesno recipe has a tdnn model, and its directory structure looks like the following:

egs/yesno/ASR/tdnn/
|-- README.md
|-- asr_datamodule.py
|-- decode.py
|-- model.py
|-- pretrained.py
`-- train.py

File description:

  • README.md

    It contains information about this recipe, e.g., how to run it, what the WER is, etc.

  • asr_datamodule.py

    It provides code to create PyTorch dataloaders for the training, validation, and test datasets.

  • decode.py

    It takes the checkpoints saved during the training stage as input and decodes the test dataset(s).

  • model.py

    It contains the definition of your fancy neural network model.

  • pretrained.py

    We can use this script to do inference with a pre-trained model.

  • train.py

    It contains training code.
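
The README.md entry above mentions reporting the WER (word error rate). For readers new to the metric, here is a minimal sketch of the usual definition: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function names are hypothetical, and this is not the scoring code that icefall itself uses.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] is the distance between the reference words processed so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(
                min(
                    prev[j] + 1,         # deletion
                    cur[j - 1] + 1,      # insertion
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = cur
    return prev[-1]


def word_error_rate(ref: str, hyp: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref_words = ref.split()
    hyp_words = hyp.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # One of four reference words is missing from the hypothesis.
    print(word_error_rate("YES NO NO YES", "YES NO YES"))  # prints 0.25
```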

Hint

Please take a look at

to get a feel for what the resulting files look like.

Note

Every model in a recipe is kept as self-contained as possible. We tolerate duplicate code among different recipes.

The training stage should be invocable by:

$ cd egs/foo/ASR
$ ./bar/train.py
$ ./bar/train.py --help
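
For the two invocations above to work, train.py needs a shebang line, the executable bit, and a command-line parser that provides --help. The following is a minimal sketch of such an entry point using argparse. The options --num-epochs and --exp-dir and their defaults are assumptions chosen for illustration, not a required interface, and the actual training loop is omitted.

```python
#!/usr/bin/env python3
import argparse
from pathlib import Path


def get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Train the bar model of the foo recipe (illustrative sketch).",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # Hypothetical options, shown only to illustrate the structure.
    parser.add_argument(
        "--num-epochs",
        type=int,
        default=10,
        help="Number of epochs to train.",
    )
    parser.add_argument(
        "--exp-dir",
        type=Path,
        default=Path("bar/exp"),
        help="Directory where checkpoints and logs are written.",
    )
    return parser


def main(argv=None) -> None:
    # argv=None makes argparse read the real command line.
    args = get_parser().parse_args(argv)
    # A real recipe would build the model and dataloaders and train here.
    print(f"Would train for {args.num_epochs} epochs, saving to {args.exp_dir}")


if __name__ == "__main__":
    main()
```

Running the script with no arguments uses the defaults, which is what makes the bare ./bar/train.py invocation work.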

Decoding

Please refer to

The decoding stage should be invocable by:

$ cd egs/foo/ASR
$ ./bar/decode.py
$ ./bar/decode.py --help
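
Because decode.py consumes checkpoints written during training, it needs some way to locate them. The sketch below shows one possible approach: scan an experiment directory and pick the checkpoint with the highest epoch number. The file naming scheme epoch-<N>.pt and the function names are assumptions made for illustration, so adapt them to whatever your train.py actually saves.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple

# Assumed naming scheme for illustration: epoch-0.pt, epoch-1.pt, ...
CHECKPOINT_PATTERN = re.compile(r"^epoch-(\d+)\.pt$")


def list_checkpoints(exp_dir: Path) -> List[Tuple[int, Path]]:
    """Return (epoch, path) pairs found in exp_dir, sorted by epoch number."""
    found = []
    for path in exp_dir.iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sorting on the integer epoch avoids the lexicographic trap where
    # "epoch-10.pt" would sort before "epoch-9.pt".
    return sorted(found)


def latest_checkpoint(exp_dir: Path) -> Optional[Path]:
    """Return the checkpoint with the highest epoch, or None if there is none."""
    checkpoints = list_checkpoints(exp_dir)
    return checkpoints[-1][1] if checkpoints else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        exp_dir = Path(tmp)
        for name in ("epoch-2.pt", "epoch-10.pt", "notes.txt"):
            (exp_dir / name).touch()
        print(latest_checkpoint(exp_dir).name)  # prints epoch-10.pt
```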

Pre-trained model

Please demonstrate how to use your model for inference in egs/foo/ASR/bar/pretrained.py. If possible, please consider creating a Colab notebook to show that.