icefall

FST-based forced alignment

This section describes how to perform FST-based forced alignment with models trained with the CTC loss.

We use the "CTC forced alignment API tutorial" from torchaudio as a reference in this section.

Unlike torchaudio, we use an FST-based approach.
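
To make concrete what forced alignment computes, here is a minimal pure-Python sketch. It is not the Kaldi-based or k2-based code described in the subsections, and the function name ctc_forced_align and its arguments are hypothetical. It assumes log_probs is a list of per-frame log-probabilities and that token id 0 is the CTC blank. It runs a Viterbi search over the transcript with blanks interleaved; conceptually, an FST-based aligner searches for the same kind of best path in a graph built from the transcript.

```python
import math


def ctc_forced_align(log_probs, tokens, blank=0):
    """Return the best per-frame token ids that spell out `tokens` under CTC.

    log_probs: list of T frames; each frame is a list of log-probabilities
               indexed by token id.
    tokens:    non-empty list of transcript token ids (without blanks).
    """
    # Expand the transcript to [blank, t1, blank, t2, ..., blank].
    states = [blank]
    for tok in tokens:
        states += [tok, blank]
    num_states, num_frames = len(states), len(log_probs)

    score = [[-math.inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]

    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][states[0]]
    score[0][1] = log_probs[0][states[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            # Allowed predecessors: stay, advance by one, or skip a blank
            # that sits between two different tokens.
            cands = [s]
            if s >= 1:
                cands.append(s - 1)
            if s >= 2 and states[s] != blank and states[s] != states[s - 2]:
                cands.append(s - 2)
            best = max(cands, key=lambda p: score[t - 1][p])
            if score[t - 1][best] == -math.inf:
                continue  # state s is not reachable at frame t
            score[t][s] = score[t - 1][best] + log_probs[t][states[s]]
            back[t][s] = best

    # A path must end in the last token or in the trailing blank.
    last = num_frames - 1
    s = num_states - 1
    if score[last][num_states - 2] > score[last][s]:
        s = num_states - 2
    if score[last][s] == -math.inf:
        raise ValueError("too few frames to align the transcript")

    # Follow the back-pointers to recover one token id per frame.
    path = [blank] * num_frames
    for t in range(last, -1, -1):
        path[t] = states[s]
        s = back[t][s]
    return path
```

Each entry of the returned list is the token id assigned to one frame. Grouping consecutive frames that carry the same non-blank id gives token-level segments, which can then be merged into word-level segments.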

Contents:

  • Two approaches
    • Differences between the two approaches
  • Kaldi-based forced alignment
    • Prepare the environment
    • Get the test data
    • Compute log_probs
    • Create token2id and id2token
    • Create word2id and id2word
    • Generate lexicon-related files
    • Convert transcript to an FST graph
    • Force aligner
    • Segment each word using the computed alignments
    • Summary
  • k2-based forced alignment

© Copyright 2021, icefall development team.