
kokoro-multi-lang-v1_1

Info about this model

This model is Kokoro v1.1-zh, obtained from https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh.

It supports both Chinese and English.

Number of speakers: 103
Sample rate: 24000
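Since the model outputs audio at a fixed 24000 Hz sample rate, the duration of generated audio is simply the sample count divided by 24000. A minimal sketch (the helper name is illustrative, not part of sherpa-onnx):

```python
SAMPLE_RATE = 24000  # fixed output sample rate of this model

def duration_seconds(num_samples: int) -> float:
    """Audio duration in seconds for a given number of generated samples."""
    return num_samples / SAMPLE_RATE

print(duration_seconds(48000))  # 2.0
```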

Meaning of speaker prefix

Prefix  Meaning          sid range  Number of speakers
af      American female  0 - 1      2
bf      British female   2          1
zf      Chinese female   3 - 57     55
zm      Chinese male     58 - 102   45
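The sid ranges in the table can be turned into a small lookup helper; a sketch (the function and table names are illustrative, not part of the sherpa-onnx API):

```python
# Map sid ranges to speaker prefixes, per the table above.
PREFIX_RANGES = [
    (range(0, 2), "af"),     # American female
    (range(2, 3), "bf"),     # British female
    (range(3, 58), "zf"),    # Chinese female
    (range(58, 103), "zm"),  # Chinese male
]

def prefix_for_sid(sid: int) -> str:
    """Return the speaker prefix for a speaker ID (0 - 102)."""
    for r, prefix in PREFIX_RANGES:
        if sid in r:
            return prefix
    raise ValueError(f"sid {sid} out of range (0 - 102)")

print(prefix_for_sid(0))    # af
print(prefix_for_sid(58))   # zm
```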

Speaker ID to speaker name (sid -> name)

The mapping from speaker ID (sid) to speaker name is given below:

0 -> af_maple, 1 -> af_sol, 2 -> bf_vale, 3 -> zf_001
4 -> zf_002, 5 -> zf_003, 6 -> zf_004, 7 -> zf_005
8 -> zf_006, 9 -> zf_007, 10 -> zf_008, 11 -> zf_017
12 -> zf_018, 13 -> zf_019, 14 -> zf_021, 15 -> zf_022
16 -> zf_023, 17 -> zf_024, 18 -> zf_026, 19 -> zf_027
20 -> zf_028, 21 -> zf_032, 22 -> zf_036, 23 -> zf_038
24 -> zf_039, 25 -> zf_040, 26 -> zf_042, 27 -> zf_043
28 -> zf_044, 29 -> zf_046, 30 -> zf_047, 31 -> zf_048
32 -> zf_049, 33 -> zf_051, 34 -> zf_059, 35 -> zf_060
36 -> zf_067, 37 -> zf_070, 38 -> zf_071, 39 -> zf_072
40 -> zf_073, 41 -> zf_074, 42 -> zf_075, 43 -> zf_076
44 -> zf_077, 45 -> zf_078, 46 -> zf_079, 47 -> zf_083
48 -> zf_084, 49 -> zf_085, 50 -> zf_086, 51 -> zf_087
52 -> zf_088, 53 -> zf_090, 54 -> zf_092, 55 -> zf_093
56 -> zf_094, 57 -> zf_099, 58 -> zm_009, 59 -> zm_010
60 -> zm_011, 61 -> zm_012, 62 -> zm_013, 63 -> zm_014
64 -> zm_015, 65 -> zm_016, 66 -> zm_020, 67 -> zm_025
68 -> zm_029, 69 -> zm_030, 70 -> zm_031, 71 -> zm_033
72 -> zm_034, 73 -> zm_035, 74 -> zm_037, 75 -> zm_041
76 -> zm_045, 77 -> zm_050, 78 -> zm_052, 79 -> zm_053
80 -> zm_054, 81 -> zm_055, 82 -> zm_056, 83 -> zm_057
84 -> zm_058, 85 -> zm_061, 86 -> zm_062, 87 -> zm_063
88 -> zm_064, 89 -> zm_065, 90 -> zm_066, 91 -> zm_068
92 -> zm_069, 93 -> zm_080, 94 -> zm_081, 95 -> zm_082
96 -> zm_089, 97 -> zm_091, 98 -> zm_095, 99 -> zm_096
100 -> zm_097, 101 -> zm_098, 102 -> zm_100

Speaker name to speaker ID (name -> sid)

The mapping from speaker name to speaker ID (sid) is given below:

af_maple -> 0, af_sol -> 1, bf_vale -> 2, zf_001 -> 3
zf_002 -> 4, zf_003 -> 5, zf_004 -> 6, zf_005 -> 7
zf_006 -> 8, zf_007 -> 9, zf_008 -> 10, zf_017 -> 11
zf_018 -> 12, zf_019 -> 13, zf_021 -> 14, zf_022 -> 15
zf_023 -> 16, zf_024 -> 17, zf_026 -> 18, zf_027 -> 19
zf_028 -> 20, zf_032 -> 21, zf_036 -> 22, zf_038 -> 23
zf_039 -> 24, zf_040 -> 25, zf_042 -> 26, zf_043 -> 27
zf_044 -> 28, zf_046 -> 29, zf_047 -> 30, zf_048 -> 31
zf_049 -> 32, zf_051 -> 33, zf_059 -> 34, zf_060 -> 35
zf_067 -> 36, zf_070 -> 37, zf_071 -> 38, zf_072 -> 39
zf_073 -> 40, zf_074 -> 41, zf_075 -> 42, zf_076 -> 43
zf_077 -> 44, zf_078 -> 45, zf_079 -> 46, zf_083 -> 47
zf_084 -> 48, zf_085 -> 49, zf_086 -> 50, zf_087 -> 51
zf_088 -> 52, zf_090 -> 53, zf_092 -> 54, zf_093 -> 55
zf_094 -> 56, zf_099 -> 57, zm_009 -> 58, zm_010 -> 59
zm_011 -> 60, zm_012 -> 61, zm_013 -> 62, zm_014 -> 63
zm_015 -> 64, zm_016 -> 65, zm_020 -> 66, zm_025 -> 67
zm_029 -> 68, zm_030 -> 69, zm_031 -> 70, zm_033 -> 71
zm_034 -> 72, zm_035 -> 73, zm_037 -> 74, zm_041 -> 75
zm_045 -> 76, zm_050 -> 77, zm_052 -> 78, zm_053 -> 79
zm_054 -> 80, zm_055 -> 81, zm_056 -> 82, zm_057 -> 83
zm_058 -> 84, zm_061 -> 85, zm_062 -> 86, zm_063 -> 87
zm_064 -> 88, zm_065 -> 89, zm_066 -> 90, zm_068 -> 91
zm_069 -> 92, zm_080 -> 93, zm_081 -> 94, zm_082 -> 95
zm_089 -> 96, zm_091 -> 97, zm_095 -> 98, zm_096 -> 99
zm_097 -> 100, zm_098 -> 101, zm_100 -> 102
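The two mappings are inverses of each other; a sketch checking that property on a small excerpt of the tables:

```python
# A few entries copied from the sid -> name table above.
sid_to_name = {
    0: "af_maple",
    2: "bf_vale",
    3: "zf_001",
    58: "zm_009",
    102: "zm_100",
}

# Inverting it reproduces the corresponding name -> sid entries.
name_to_sid = {name: sid for sid, name in sid_to_name.items()}

print(name_to_sid["zm_009"])  # 58
```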

Download the model

Model download address

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2
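On Linux or macOS, assuming wget and tar are available, the archive can be downloaded and extracted with:

```shell
# Download the model archive and extract it into ./kokoro-multi-lang-v1_1
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2
tar xf kokoro-multi-lang-v1_1.tar.bz2

# The directory contains model.onnx, voices.bin, tokens.txt, the lexicon
# files, and espeak-ng-data, which the examples below reference.
ls kokoro-multi-lang-v1_1
```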

Android APK

The following table shows the Android TTS Engine APKs built with this model for sherpa-onnx v1.13.2:

ABI          URL       China mirror
arm64-v8a    Download  Download
armeabi-v7a  Download  Download
x86_64       Download  Download
x86          Download  Download

If you don’t know what ABI is, you probably need to select arm64-v8a.

The source code for the APK can be found at

https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine

Please refer to the documentation for how to build the APK from source code.

More Android APKs can be found at

https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

Python API

Assume you have installed sherpa-onnx via

pip install sherpa-onnx

and you have downloaded the model from

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2

You can use the following code to play with kokoro-multi-lang-v1_1 with the Python API.

import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(
            model="kokoro-multi-lang-v1_1/model.onnx",
            voices="kokoro-multi-lang-v1_1/voices.bin",
            tokens="kokoro-multi-lang-v1_1/tokens.txt",
            data_dir="kokoro-multi-lang-v1_1/espeak-ng-data",
            lexicon="kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
                     sid=0,
                     speed=1.0)

# Note: soundfile infers the output format from the file extension; writing
# MP3 requires libsndfile >= 1.1.0. Use "test.wav" if MP3 writing fails.
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)

C API

You can use the following code to play with kokoro-multi-lang-v1_1 with C API.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int main(int argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.kokoro.model = "kokoro-multi-lang-v1_1/model.onnx";
  config.model.kokoro.voices = "kokoro-multi-lang-v1_1/voices.bin";
  config.model.kokoro.tokens = "kokoro-multi-lang-v1_1/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-multi-lang-v1_1/espeak-ng-data";
  config.model.kokoro.lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  const char *text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

#if 0
  // If you don't want to use a callback, then please enable this branch
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
#else
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);
#endif

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.c. Then you can compile it with the following command:

gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.c \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

C++ API

You can use the following code to play with kokoro-multi-lang-v1_1 with C++ API.

#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int main(int argc, char *argv[]) {
  using namespace sherpa_onnx::cxx; // NOLINT
  OfflineTtsConfig config;

  config.model.kokoro.model = "kokoro-multi-lang-v1_1/model.onnx";
  config.model.kokoro.voices = "kokoro-multi-lang-v1_1/voices.bin";
  config.model.kokoro.tokens = "kokoro-multi-lang-v1_1/tokens.txt";
  config.model.kokoro.data_dir = "kokoro-multi-lang-v1_1/espeak-ng-data";
  config.model.kokoro.lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";

  config.model.num_threads = 1;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0; // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}

To compile the example, first build sherpa-onnx as a shared library:

cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared

cmake \
 -DSHERPA_ONNX_ENABLE_C_API=ON \
 -DCMAKE_BUILD_TYPE=Release \
 -DBUILD_SHARED_LIBS=ON \
 -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
 ..

make
make install

You can find the required header and library files inside /tmp/sherpa-onnx/shared.

Assume you have saved the above example file as /tmp/test-kokoro.cc. Then you can compile it with the following command:

g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -L /tmp/sherpa-onnx/shared/lib \
  -o /tmp/test-kokoro \
  /tmp/test-kokoro.cc \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime

Now you can run

cd /tmp

# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro

You probably need to run

# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH

# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH

before you run /tmp/test-kokoro.

Please see the documentation at

https://k2-fsa.github.io/sherpa/onnx/c-api/index.html

Rust API

You can use the following code to play with kokoro-multi-lang-v1_1 with Rust API.

use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kokoro: OfflineTtsKokoroModelConfig {
                model: Some("kokoro-multi-lang-v1_1/model.onnx".into()),
                voices: Some("kokoro-multi-lang-v1_1/voices.bin".into()),
                tokens: Some("kokoro-multi-lang-v1_1/tokens.txt".into()),
                data_dir: Some("kokoro-multi-lang-v1_1/espeak-ng-data".into()),
                lexicon: Some("kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");

    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}

Please refer to the Rust API documentation for how to build and run the above Rust example.

Node.js (addon) API

You need to install the sherpa-onnx-node npm package first:

npm install sherpa-onnx-node

You can use the following code to play with kokoro-multi-lang-v1_1 with the Node.js addon API.

const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      kokoro: {
        model: 'kokoro-multi-lang-v1_1/model.onnx',
        voices: 'kokoro-multi-lang-v1_1/voices.bin',
        tokens: 'kokoro-multi-lang-v1_1/tokens.txt',
        dataDir: 'kokoro-multi-lang-v1_1/espeak-ng-data',
        lexicon: 'kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});

console.log(`Saved to ${filename}`);

Please refer to the Node.js addon API documentation for more details.

Dart API

You can use the following code to play with kokoro-multi-lang-v1_1 with Dart API.

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(
    model: 'kokoro-multi-lang-v1_1/model.onnx',
    voices: 'kokoro-multi-lang-v1_1/voices.bin',
    tokens: 'kokoro-multi-lang-v1_1/tokens.txt',
    dataDir: 'kokoro-multi-lang-v1_1/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kokoro: kokoro,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSentences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final text =
      'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.';
  final audio = tts.generateWithConfig(text: text, config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}

Please refer to the Dart API documentation for more details.

Swift API

You can use the following code to play with kokoro-multi-lang-v1_1 with Swift API.

func run() {
  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(
    model: "kokoro-multi-lang-v1_1/model.onnx",
    voices: "kokoro-multi-lang-v1_1/voices.bin",
    tokens: "kokoro-multi-lang-v1_1/tokens.txt",
    dataDir: "kokoro-multi-lang-v1_1/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)

  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}

Please refer to the Swift API documentation for more details.

C# API

You can use the following code to play with kokoro-multi-lang-v1_1 with C# API.

using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Kokoro.Model = "kokoro-multi-lang-v1_1/model.onnx";
config.Model.Kokoro.Voices = "kokoro-multi-lang-v1_1/voices.bin";
config.Model.Kokoro.Tokens = "kokoro-multi-lang-v1_1/tokens.txt";
config.Model.Kokoro.DataDir = "kokoro-multi-lang-v1_1/espeak-ng-data";
config.Model.Kokoro.Lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);
var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");

if (ok)
{
  Console.WriteLine("Saved to ./test.wav");
}
else
{
  Console.WriteLine("Failed to save ./test.wav");
}

Please refer to the C# API documentation for more details.

Kotlin API

You can use the following code to play with kokoro-multi-lang-v1_1 with Kotlin API.

package com.k2fsa.sherpa.onnx

fun main() {
  var config = OfflineTtsConfig(
    model = OfflineTtsModelConfig(
      kokoro = OfflineTtsKokoroModelConfig(
        model = "kokoro-multi-lang-v1_1/model.onnx",
        voices = "kokoro-multi-lang-v1_1/voices.bin",
        tokens = "kokoro-multi-lang-v1_1/tokens.txt",
        dataDir = "kokoro-multi-lang-v1_1/espeak-ng-data",
        lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
      ),
      numThreads = 1,
      debug = true,
    ),
  )
  val tts = OfflineTts(config = config)
  val genConfig = GenerationConfig(
    sid = 0,
    speed = 1.0f,
    silenceScale = 0.2f,
  )
  val audio = tts.generateWithConfigAndCallback(
    text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
    config = genConfig,
    callback = ::callback,
  )
  audio.save(filename = "test.wav")
  tts.release()
  println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
  // 1 means to continue
  // 0 means to stop
  return 1
}

Please refer to the Kotlin API documentation for more details.

Java API

You can use the following code to play with kokoro-multi-lang-v1_1 with Java API.

import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
  public static void main(String[] args) {
    var kokoro = new OfflineTtsKokoroModelConfig();
    kokoro.setModel("kokoro-multi-lang-v1_1/model.onnx");
    kokoro.setVoices("kokoro-multi-lang-v1_1/voices.bin");
    kokoro.setTokens("kokoro-multi-lang-v1_1/tokens.txt");
    kokoro.setDataDir("kokoro-multi-lang-v1_1/espeak-ng-data");
    kokoro.setLexicon("kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt");

    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setKokoro(kokoro);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);

    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setMaxNumSentences(1);

    var tts = new OfflineTts(config);
    var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";

    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);

    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });

    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}

Please refer to the Java API documentation for more details.

Pascal API

You can use the following code to play with kokoro-multi-lang-v1_1 with Pascal API.

program test_kokoro;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);

  Config.Model.Kokoro.Model := 'kokoro-multi-lang-v1_1/model.onnx';
  Config.Model.Kokoro.Voices := 'kokoro-multi-lang-v1_1/voices.bin';
  Config.Model.Kokoro.Tokens := 'kokoro-multi-lang-v1_1/tokens.txt';
  Config.Model.Kokoro.DataDir := 'kokoro-multi-lang-v1_1/espeak-ng-data';
  Config.Model.Kokoro.Lexicon := 'kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);

  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.

Please refer to the Pascal API documentation for more details.

Go API

You can use the following code to play with kokoro-multi-lang-v1_1 with Go API.

package main

import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Kokoro: sherpa.OfflineTtsKokoroModelConfig{
				Model:  "kokoro-multi-lang-v1_1/model.onnx",
				Voices: "kokoro-multi-lang-v1_1/voices.bin",
				Tokens: "kokoro-multi-lang-v1_1/tokens.txt",
				DataDir: "kokoro-multi-lang-v1_1/espeak-ng-data",
				Lexicon: "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
			},
			NumThreads: 1,
			Debug:      true,
		},
		MaxNumSentences: 1,
	}

	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()

	text := "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."

	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}

	audio := tts.GenerateWithConfig(text, &genConfig, nil)

	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)

	fmt.Printf("Saved to %s\n", filename)
}

Please refer to the Go API documentation for more details.

Samples

For the following text:

This model supports both Chinese and English. 小米的核心价值观是什么?答案
是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习.
我在研究 machine learning。What do you think 中英文说的如何呢?
今天是 2025年6月18号.

sample audio was generated for each of the 103 speakers, from sid 0 (af_maple) through sid 102 (zm_100); speaker names follow the sid -> name mapping given above.