Icefall
Documentation for icefall, containing speech recognition recipes using k2.
Contents:
Icefall for dummies tutorial
Environment setup
Data Preparation
Training
Decoding
Model Export
Installation
(0) Install CUDA toolkit and cuDNN
(1) Install torch and torchaudio
(2) Install k2
(3) Install lhotse
(4) Download icefall
Installation example
Test Your Installation
YouTube Video
Docker
Introduction
View available tags
Download a docker image (CUDA)
Download a docker image (CPU)
Run a docker image with GPU
Run a docker image with CPU
Run yesno within a docker container
Frequently Asked Questions (FAQs)
OSError: libtorch_hip.so: cannot open shared object file: no such file or directory
AttributeError: module 'distutils' has no attribute 'version'
ImportError: libpython3.10.so.1.0: cannot open shared object file: No such file or directory
Model export
Export model.state_dict()
Export model with torch.jit.trace()
Export model with torch.jit.script()
Export to ONNX
Export to ncnn
FST-based forced alignment
Two approaches
Kaldi-based forced alignment
k2-based forced alignment
Recipes
Non Streaming ASR
aishell
LibriSpeech
TIMIT
YesNo
Streaming ASR
Introduction
LibriSpeech
RNN-LM
Train an RNN language model
TTS
VITS-LJSpeech
VITS-VCTK
Fine-tune a pre-trained model
Finetune from a supervised pre-trained Zipformer model
Finetune from a pre-trained Zipformer model with adapters
Contributing
Contributing to Documentation
Follow the code style
How to create a recipe
Huggingface
Pre-trained models
Huggingface spaces
Decoding with language models
Shallow fusion for Transducer
LODR for RNN Transducer
LM rescoring for Transducer