Conformer transducer based non-streaming ASR

This page describes how to use sherpa for non-streaming ASR with a Conformer transducer model.

For demonstration, we use models pre-trained on the following datasets:

aishell

LibriSpeech

aishell is a Chinese dataset; its pre-trained model uses Chinese characters as modeling units, with a vocabulary size of 4336.

LibriSpeech is an English dataset; its pre-trained model uses BPE tokens as modeling units, with a vocabulary size of 500.
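To make the difference between the two kinds of modeling units concrete, here is a minimal Python sketch. It is only an illustration: the tiny vocabularies, the helper names (`build_vocab`, `encode_chars`, `encode_subwords`), and the choice of ID 0 for the blank symbol are assumptions made up for this example, not the token tables shipped with the pre-trained models. The subword function uses greedy longest-match segmentation as a stand-in; a real BPE tokenizer applies learned merge rules instead.

```python
# Toy illustration only: these vocabularies are invented for the example and
# are NOT the token tables of the aishell or LibriSpeech pre-trained models.


def build_vocab(units):
    """Map each modeling unit to an integer ID.

    ID 0 is reserved here for a blank symbol (an assumption for this sketch).
    """
    vocab = {"<blk>": 0}
    for unit in units:
        if unit not in vocab:
            vocab[unit] = len(vocab)
    return vocab


def encode_chars(text, vocab):
    """Character-level modeling units: one token per character."""
    return [vocab[ch] for ch in text]


def encode_subwords(word, vocab):
    """Split a word into subword pieces by greedy longest match.

    This stands in for BPE; it is not the BPE algorithm itself.
    """
    ids = []
    start = 0
    while start < len(word):
        # Try the longest remaining piece first, then shorter ones.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                ids.append(vocab[piece])
                start = end
                break
        else:
            raise KeyError(f"cannot segment {word[start:]!r}")
    return ids


# Character units, as in the Chinese case: every character is its own token.
char_vocab = build_vocab("你好世界")
char_ids = encode_chars("你好", char_vocab)

# Subword units, as in the English case: a word may span several tokens.
subword_vocab = build_vocab(["speech", "recogni", "tion", "s", "p", "e", "c", "h"])
subword_ids = encode_subwords("recognition", subword_vocab)

print(char_ids)
print(subword_ids)
```

With character units, the vocabulary grows with the number of distinct characters in the training text, which is why the Chinese model needs several thousand entries. With subword units, the vocabulary size is chosen in advance and a word is covered by one or more pieces.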

For each dataset's demo below, we describe how to use both the server and the client. Each demo also links to pre-trained models that we provide, so you can try them without doing any training.