Non-streaming WebSocket server and client

Hint

Please refer to Installation to install sherpa-onnx before you read this section.

Build sherpa-onnx with WebSocket support

By default, the build generates the following binaries after Installation:

sherpa-onnx fangjun$ ls -lh build/bin/*websocket*
-rwxr-xr-x  1 fangjun  staff   1.1M Mar 31 22:09 build/bin/sherpa-onnx-offline-websocket-server
-rwxr-xr-x  1 fangjun  staff   1.0M Mar 31 22:09 build/bin/sherpa-onnx-online-websocket-client
-rwxr-xr-x  1 fangjun  staff   1.2M Mar 31 22:09 build/bin/sherpa-onnx-online-websocket-server

Please refer to Streaming WebSocket server and client for the usage of sherpa-onnx-online-websocket-server and sherpa-onnx-online-websocket-client.

View the server usage

Before starting the server, let us view the help message of sherpa-onnx-offline-websocket-server:

build/bin/sherpa-onnx-offline-websocket-server

The above command will print the following help information:

Automatic speech recognition with sherpa-onnx using websocket.

Usage:

./bin/sherpa-onnx-offline-websocket-server --help

(1) For transducer models

./bin/sherpa-onnx-offline-websocket-server \
  --port=6006 \
  --num-work-threads=5 \
  --tokens=/path/to/tokens.txt \
  --encoder=/path/to/encoder.onnx \
  --decoder=/path/to/decoder.onnx \
  --joiner=/path/to/joiner.onnx \
  --log-file=./log.txt \
  --max-batch-size=5

(2) For Paraformer

./bin/sherpa-onnx-offline-websocket-server \
  --port=6006 \
  --num-work-threads=5 \
  --tokens=/path/to/tokens.txt \
  --paraformer=/path/to/model.onnx \
  --log-file=./log.txt \
  --max-batch-size=5

Please refer to
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
for a list of pre-trained models to download.

Options:
  --log-file                  : Path to the log file. Logs are appended to this file (string, default = "./log.txt")
  --max-utterance-length      : Max utterance length in seconds. If we receive an utterance longer than this value, we will reject the connection. If you have enough memory, you can select a large value for it. (float, default = 300)
  --decoding-method           : Decoding method. Valid values: greedy_search. (string, default = "greedy_search")
  --num-threads               : Number of threads to run the neural network (int, default = 2)
  --feat-dim                  : Feature dimension. Must match the one expected by the model. (int, default = 80)
  --port                      : The port on which the server will listen. (int, default = 6006)
  --debug                     : true to print model information while loading it. (bool, default = false)
  --joiner                    : Path to joiner.onnx (string, default = "")
  --tokens                    : Path to tokens.txt (string, default = "")
  --encoder                   : Path to encoder.onnx (string, default = "")
  --num-work-threads          : Thread pool size for neural network computation and decoding. (int, default = 3)
  --paraformer                : Path to model.onnx of paraformer. (string, default = "")
  --num-io-threads            : Thread pool size for network connections. (int, default = 1)
  --max-batch-size            : Max batch size for decoding. (int, default = 5)
  --decoder                   : Path to decoder.onnx (string, default = "")

Standard options:
  --help                      : Print out usage message (bool, default = false)
  --print-args                : Print the command line arguments (to stderr) (bool, default = true)
  --config                    : Configuration file to read (this option may be repeated) (string, default = "")
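
The --config standard option reads additional options from a file. sherpa-onnx's option parser is ported from Kaldi's ParseOptions, so the file format is presumably the Kaldi one, with one --key=value per line; the server.conf below is a hypothetical example, not taken from the sherpa-onnx documentation:

--port=6006
--num-work-threads=5
--tokens=/path/to/tokens.txt
--paraformer=/path/to/model.onnx

You would then start the server with --config=./server.conf.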


Start the server

Hint

Please refer to Pre-trained models for a list of pre-trained models.

Start the server with a transducer model

./build/bin/sherpa-onnx-offline-websocket-server \
  --port=6006 \
  --num-work-threads=5 \
  --tokens=./sherpa-onnx-zipformer-en-2023-03-30/tokens.txt \
  --encoder=./sherpa-onnx-zipformer-en-2023-03-30/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-zipformer-en-2023-03-30/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-zipformer-en-2023-03-30/joiner-epoch-99-avg-1.onnx \
  --log-file=./log.txt \
  --max-batch-size=5

Caution

The arguments are in the form --key=value.

It does not support --key value. For example, --port=6006 is accepted, but --port 6006 is not.

Hint

In the above demo, the model files are from csukuangfj/sherpa-onnx-zipformer-en-2023-03-30 (English).

Note

The server can process requests from multiple clients in parallel by batching them. You can use --max-batch-size to limit the batch size; the parallel Python client shown below is a convenient way to exercise this.

Start the server with a paraformer model

./build/bin/sherpa-onnx-offline-websocket-server \
  --port=6006 \
  --num-work-threads=5 \
  --tokens=./sherpa-onnx-paraformer-zh-2023-03-28/tokens.txt \
  --paraformer=./sherpa-onnx-paraformer-zh-2023-03-28/model.onnx \
  --log-file=./log.txt \
  --max-batch-size=5

Hint

In the above demo, the model files are from csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28 (Chinese + English).

Start the client (Python)

We provide two clients written in Python, one that decodes the given files in parallel and one that decodes them sequentially:

offline-websocket-client-decode-files-paralell.py

python3 ./python-api-examples/offline-websocket-client-decode-files-paralell.py \
  --server-addr localhost \
  --server-port 6006 \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/0.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/1.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/2.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/8k.wav

offline-websocket-client-decode-files-sequential.py

python3 ./python-api-examples/offline-websocket-client-decode-files-sequential.py \
  --server-addr localhost \
  --server-port 6006 \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/0.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/1.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/2.wav \
  ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/8k.wav
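
If you want to write your own client, the following is a minimal sketch of what the two example scripts above do, assuming the wire format they implement: a 4-byte little-endian integer carrying the sample rate, a 4-byte little-endian integer carrying the payload size in bytes, the audio samples as 32-bit floats in [-1, 1], and finally the text message "Done" once the client is finished. The file name minimal_client.py and the chunk size are illustrative; treat the scripts in python-api-examples as the authoritative reference.

# minimal_client.py -- an illustrative sketch, not shipped with sherpa-onnx.
# Assumed wire format (see the example clients above for the real thing):
#   int32 little-endian sample rate,
#   int32 little-endian payload size in bytes,
#   float32 samples in [-1, 1],
#   then the text message "Done" to end the session.
import asyncio
import sys
import wave

import numpy as np
import websockets  # pip install websockets


def read_wave(filename: str):
    """Read a 16-bit mono wave file; return (float32 samples, sample rate)."""
    with wave.open(filename) as f:
        assert f.getnchannels() == 1
        assert f.getsampwidth() == 2  # 16-bit samples
        samples = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
        return samples.astype(np.float32) / 32768, f.getframerate()


async def main(filename: str) -> None:
    samples, sample_rate = read_wave(filename)
    buf = sample_rate.to_bytes(4, "little")
    buf += (samples.size * 4).to_bytes(4, "little")  # payload size in bytes
    buf += samples.tobytes()

    async with websockets.connect("ws://localhost:6006") as ws:
        # Send in small pieces so no single WebSocket message gets too large.
        chunk = 10240
        for i in range(0, len(buf), chunk):
            await ws.send(buf[i : i + chunk])

        print(await ws.recv())  # the recognition result, as text
        await ws.send("Done")  # tell the server this client is finished


if __name__ == "__main__":
    asyncio.run(main(sys.argv[1]))

With a server already running on port 6006, you would invoke it as:

python3 ./minimal_client.py ./sherpa-onnx-paraformer-zh-2023-03-28/test_wavs/0.wav

To exercise the server-side batching mentioned above, run several instances of such a client at the same time, or simply use the parallel client.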