kitten-nano-en-v0_1-fp16

Info about this model

This model is kitten-tts-nano-0.1 from https://huggingface.co/KittenML/kitten-tts-nano-0.1

It supports only English.

Number of speakers    Sample rate (Hz)
8                     24000

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2
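
As a minimal sketch, the archive can also be fetched and unpacked from Python using only the standard library. The helper names archive_name and download_and_extract are illustrative and not part of sherpa-onnx. The returned directory name assumes the archive unpacks into a folder named after the archive, which matches the paths used in the examples later on this page.

```python
import tarfile
import urllib.request
from pathlib import Path

# Download address copied from the section above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-nano-en-v0_1-fp16.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file-name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def download_and_extract(url: str = MODEL_URL, dest: str = ".") -> Path:
    """Download the archive into dest (if it is not there yet) and unpack it."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)

    archive = dest_dir / archive_name(url)
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))

    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)

    # Assumption: the archive unpacks into a directory named after itself.
    return dest_dir / archive.name[: -len(".tar.bz2")]


# Usage (performs a network download):
#   model_dir = download_and_extract()
```

Calling download_and_extract() from the directory where you plan to run the examples places the model files at ./kitten-nano-en-v0_1-fp16, provided the assumption about the archive layout holds.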

Android APK

The following table lists the Android TTS Engine APKs built with this model for sherpa-onnx v1.12.10.

ABI            URL         China mirror
arm64-v8a      Download    Download
armeabi-v7a    Download    Download
x86_64         Download    Download
x86            Download    Download

If you don't know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Meaning of speaker suffix

Suffix    Meaning
f         Female
m         Male

speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 -> expr-voice-2-m
1 -> expr-voice-2-f
2 -> expr-voice-3-m
3 -> expr-voice-3-f
4 -> expr-voice-4-m
5 -> expr-voice-4-f
6 -> expr-voice-5-m
7 -> expr-voice-5-f

speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

expr-voice-2-m -> 0
expr-voice-2-f -> 1
expr-voice-3-m -> 2
expr-voice-3-f -> 3
expr-voice-4-m -> 4
expr-voice-4-f -> 5
expr-voice-5-m -> 6
expr-voice-5-f -> 7
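
The two mappings above can be sketched in Python as a pair of dictionaries built from a single list. The names SPEAKER_NAMES, SID_TO_NAME, NAME_TO_SID and speaker_gender are illustrative only; sherpa-onnx itself takes the integer sid, so a lookup like this is just a convenience on the caller's side.

```python
# Speaker names in sid order, copied from the tables above.
SPEAKER_NAMES = [
    "expr-voice-2-m",
    "expr-voice-2-f",
    "expr-voice-3-m",
    "expr-voice-3-f",
    "expr-voice-4-m",
    "expr-voice-4-f",
    "expr-voice-5-m",
    "expr-voice-5-f",
]

# sid -> name and name -> sid lookups derived from the same list.
SID_TO_NAME = dict(enumerate(SPEAKER_NAMES))
NAME_TO_SID = {name: sid for sid, name in SID_TO_NAME.items()}

# Suffix meanings from the "Meaning of speaker suffix" table.
SUFFIX_MEANING = {"f": "Female", "m": "Male"}


def speaker_gender(name: str) -> str:
    """Return the meaning of the trailing -f / -m suffix of a speaker name."""
    suffix = name.rsplit("-", 1)[-1]
    return SUFFIX_MEANING[suffix]
```

For instance, NAME_TO_SID["expr-voice-4-f"] gives the sid to pass to the generate call when you want that voice.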

C API

You can use the following code to try kitten-nano-en-v0_1-fp16 with the C API.

#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.kitten.model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-nano-en-v0_1-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data";

  config.model.num_threads = 1;
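  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return 1;
  }

  int sid = 0; // speaker id
  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerate(tts, text, sid, 1.0);
  if (!audio) {
    fprintf(stderr, "Failed to generate audio\n");
    SherpaOnnxDestroyOfflineTts(tts);
    return 1;
  }

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");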

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

In the following, we describe how to compile and run the above C example.

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..

make
make install

You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kitten.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-kitten \
  /tmp/test-kitten.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kitten

If /tmp/test-kitten cannot find the shared libraries at run time, set the library search path before running it:

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

For more details about the C API, please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2

After extracting the archive into the current directory, you can use the following code to try the model in ./kitten-nano-en-v0_1-fp16:

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
            voices="./kitten-nano-en-v0_1-fp16/voices.bin",
            tokens="./kitten-nano-en-v0_1-fp16/tokens.txt",
            data_dir="./kitten-nano-en-v0_1-fp16/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
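
Building on the example above, the following sketch writes one file per speaker and reports each clip's duration as the number of samples divided by the sample rate. It assumes tts is the sherpa_onnx.OfflineTts object created above; NUM_SPEAKERS comes from the model info table, and the helper names and the speaker-N.wav naming scheme are illustrative choices, not part of sherpa-onnx.

```python
from pathlib import Path

NUM_SPEAKERS = 8  # from the model info table on this page

TEXT = (
    "Friends fell out often because life was changing so fast. "
    "The easiest thing in the world was to lose touch with someone."
)


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a mono clip: number of samples divided by sample rate."""
    return num_samples / sample_rate


def output_path(sid: int, out_dir: str = ".") -> Path:
    """Illustrative naming scheme: one WAV file per speaker ID."""
    return Path(out_dir) / f"speaker-{sid}.wav"


def generate_all(tts, text: str = TEXT, out_dir: str = ".") -> None:
    """Generate the same text once per speaker and save each result.

    tts is expected to be the sherpa_onnx.OfflineTts instance created
    in the example above.
    """
    # Imported here so the pure helpers above work without soundfile.
    import soundfile as sf

    for sid in range(NUM_SPEAKERS):
        audio = tts.generate(text=text, sid=sid, speed=1.0)
        path = output_path(sid, out_dir)
        sf.write(str(path), audio.samples, samplerate=audio.sample_rate)
        secs = duration_seconds(len(audio.samples), audio.sample_rate)
        print(f"sid={sid}: wrote {path} ({secs:.2f} s)")
```

Calling generate_all(tts) after the example above should leave eight files, speaker-0.wav to speaker-7.wav, in the current directory.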

Samples

For the following text:

Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.

audio samples for the different speakers are listed below:

Speaker 0 - expr-voice-2-m

Speaker 1 - expr-voice-2-f

Speaker 2 - expr-voice-3-m

Speaker 3 - expr-voice-3-f

Speaker 4 - expr-voice-4-m

Speaker 5 - expr-voice-4-f

Speaker 6 - expr-voice-5-m

Speaker 7 - expr-voice-5-f